SlideShare a Scribd company logo
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 15
Paper Publications
Heart Diseases Diagnosis Using Data Mining
Techniques
1
Dr. R. Muralidharan, 2
D. Vinodhini
1
M.Sc. M.Phil. MCA., Ph.D. Vice Principal & HOD in CS Rathinam College of Arts & Science Coimbatore, India
2
M.Sc., Department of Computer Science, Rathinam College of Arts & Science, Coimbatore, India
Abstract: In health care domain, data mining plays a vital role for predicting diseases. For detecting a disease
number of tests should be required from the patient but number of test should be reduced while using data mining
techniques. The data mining technique analyze the test parameters and it concludes the associative relation
between the parameters that reduces the tests and the reduced test plays key role in time and performance. In this
study medical terms such as sex, blood pressure, and cholesterol like nineteen input attributes are used. In this
paper association among various attributes which are the causative factors of heart diseases are analyzed. The
patient’s records are observed before prediction and the factors are grouped as per its severity level. In this system
the level of causative factors are categorized using K-Means clustering technique and it distinguishes the risky and
non-risky factors. Frequent risk factors are mined from the clinical heart database using Apriori algorithm. The
risk factors are taken for this study to predict the risk level and find the co-ordination among the factors that helps
the medical people to predict the disease with minimum tests and treatments.
Keywords: Associative Mining algorithm, Data mining, Heart disease, K-Means Clustering.
I. INTRODUCTION
Data mining is the extraction of hidden predictive information from large databases, is a powerful new technology with
great potential to help companies focus on the most important information in their data warehouses. Data Mining is a very
crucial research domain in recent research world. The techniques are useful to elicit significant and utilizable knowledge
which can be perceived by many individuals. Data mining programs consists of diverse methodologies which are
predominantly produced and used by commercial enterprises and biomedical researchers. These techniques are well
disposed towards their respective knowledge domain. The heart is very important part of human body, which pumps
blood into the entire body. If circulation of blood in body is inefficient the organs like brain suffer and if heart stops
working altogether, death occurs within minutes. Life is completely dependent on efficient working of the heart. The term
Heart disease refers to disease of heart & blood vessel system within it. This study is closely related to heart diseases
prediction and it provides an awareness to the patient to prevent them from unexpected heart problem. This paper
analyzes the heart disease predictions using classification algorithms. Medicinal data mining has high potential for
exploring the unknown patterns in the data sets of medical domain. These patterns can be used for medical analysis in raw
medical data. Heart disease was the major cause of casualties in the world. Half of the deaths occur in the countries like
India, United States are due to cardiovascular diseases. Medical data mining techniques like Association Rule Mining,
Clustering, and Classification Algorithms are applied to analyze the different kinds of heart based problems.
Data mining is a process of extracting/equating the meaningful data from the large database by using different tools and
techniques. Data mining (sometimes called information or knowledge discovery) is the process of evaluating data from
different outlooks and summarizing it into useful information - information that can be used to increase revenue, cuts
costs, or both. Data mining software is one of a number of diagnostic tools for analyzing data. It allows users to
investigate data from many different dimensions or angles, categorize it, and summarize the relationships identified.
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 16
Paper Publications
Technically, data mining is the process of finding correlations or forms among dozens of fields in large relational
databases. Data mining or knowledge discovery in databases (KDD) is an interdisciplinary field where we integrate
techniques from different fields including data base systems, statistics, mathematics, high performance computing,
artificial intelligence and machine learning.
Fig. 1 Basic idea about KDD
This data mining techniques are used in health care domain for prediction of problem, disease detection, optimizing the
solution and so on. Recent technologies are nowadays able to provide a lot of information on health-related activities,
which can then be analyzed in order to find important information and to collect relevant information. This data mining
techniques are used for disease detection, pattern recognition by using multiple application. Data mining is about to
identify the similarities between searching the valuable business information from the large database systems such as
finding linked products in gigabytes of store scanner data or the mining a mountain for a vein of valuable dataset. Both
kind of processes required either shifting through an immense amount of material, or to perform the search intelligently so
that exactly match will be performed. Data mining can be done on a database whose size and quality are sufficient. Data
mining techniques are often used to study health factors.
The purpose of the study is to analyze the human health using classification techniques and predict the risk level of heart
diseases based on the health factors. The study applied data mining methods like k-means and associative mining
facilitate an improvement in the interpretation of the clinical data set.
II. REVIEW OF LITERATURE
The paper titled “The Survey of Data Mining Applications and Feature Scope” - focused a variety of techniques,
approaches and different areas of the research which are helpful and marked as the important field of data mining
Technologies. As they are aware that many MNC‟s and large organizations are operated in different places of the different
countries. Each place of operation may generate large volumes of data. Corporate decision makers require access from all
such sources and take strategic decisions. The data warehouse is used in the significant business value by improving the
effectiveness of managerial decision-making. In an uncertain and highly competitive business environment, the value of
strategic information systems such as these are easily recognized however in today‟s business environment, efficiency or
speed is not the only key for competitiveness. These types of huge amount of data‟s are available in the form of tera- to
peta-bytes which has drastically changed in the areas of science and engineering. To analyze, manage and make a
decision of such type of huge amount of data we need techniques called the data mining which will transforming in many
fields. They imparts more number of applications of data mining and also focuses scope of the data mining which will
helpful in the further research.
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 17
Paper Publications
“A Literature Review of Data Mining Techniques Used in Healthcare Databases” - presented an overview of the current
research being carried out using the data mining techniques for the diagnosis and prognosis of various diseases. The goal
of this study is to identify the well-performing data mining algorithms used on medical databases. The following
algorithms have been identified: Decision Trees, Support Vector Machine, Artificial neural networks and their Multilayer
Perceptron model, Naive Bayes, Apriori, Fuzzy Rules. Analyses show that it is very difficult to name a single data mining
algorithm as the most suitable for the diagnosis and/or prognosis of diseases. At times some algorithms perform better
than others, but there are cases when a combination of the best properties of some of the aforementioned algorithms
together results more effective.
“An Empirical study on prediction of Heart disease using Classification Data mining techniques” - the use of pattern
recognition and data mining techniques into risk prediction models in the clinical domain of cardiovascular medicine is
proposed. The data is to be modelled and classified by using classification data mining technique. Some of the limitations
of the conventional medical scoring systems are that there is a presence of intrinsic linear combinations of variables in the
input set and hence they are not adept at modelling nonlinear complex interactions in medical domains. This limitation is
handled in this research by use of classification models which can implicitly detect complex nonlinear relationships
between dependent and independent variables as well as the ability to detect all possible interactions between predictor
variables.
“Analysis of Attribute Association in Heart Disease Using Data Mining Techniques” - In data mining association rule
mining represents a promising technique to find hidden patterns in large data bases. The main issue about mining
association rules in a medical data is the large number of rules that are discovered, most of which are irrelevant. A rule-
based decision support system (DSS) is presented for the diagnosis of coronary vascular disease (CVD). The dataset used
for the DSS generation and evaluation consists of 1897 subjects, each one characterized by 21 features, including
demographic and history data, as well as laboratory examinations. Such number of rules makes the search slow. However,
not all of the generated rules are interesting, and some rules may be ignored. In medical terms, association rules relate
disease data measures the patient risk factors and occurrence of the disease. Association rule medical significance is
evaluated with the usual support and confidence metrics. Association rules are compared to predictive rules mined with
decision trees, a well-known machine learning technique. In this paper [12] they proposed a new system to find the
strength of association among the attributes of a given data set.
“A Fuzzification Approach for Prediction of Heart Disease” - Data Mining operations and approaches are the
improvement over the statistical methods that enables a user to perform the future analysis based on current dataset. One
of such analysis provided by data mining approaches is the predication based analysis. In their work [13], the heart
disease prediction system is designed. The heart disease prediction is actually an expert system application which requires
the authenticated dataset to process. A Fuzzy based soft computing approach is been implemented on multiple parameters
to predict the heart disease. In this paper [13], the earlier work done in the area of medical disease prediction is studied as
well a new fuzzy rule based approach is suggested to perform the heart disease prediction.
“Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction” - The healthcare
environment is still “information rich but knowledge poor”. There is a wealth of data available within the healthcare
systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. This
research paper [14] intends to provide a survey of current techniques of knowledge discovery in databases using data
mining techniques that are in use in today‟s medical research particularly in Heart Disease Prediction. Number of
experiment has been conducted to compare the performance of predictive data mining technique on the same dataset and
the outcome reveals that Decision Tree outperforms and some time Bayesian classification is having similar accuracy as
of decision tree but other predictive methods like Neural Networks, Classification based on clustering are not performing
well. The second conclusion is that the accuracy of the Decision Tree and Bayesian Classification further improves after
applying genetic algorithm to reduce the actual data size to get the optimal subset of attribute sufficient for heart disease
prediction.
A Heart Disease Prediction Model using Decision Tree - In this paper [15], they developed a heart disease prediction
model that can assist medical professionals in predicting heart disease status based on the clinical data of patients. Firstly,
they select 14 important clinical features, i.e., age, sex, chest pain type, Blood Pressure, cholesterol, fasting blood sugar,
resting ECG, max heart rate, exercise induced angina, old peak, slope, number of vessels colored and diagnosis of heart
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 18
Paper Publications
disease. Secondly, they developed a prediction model using J48 decision tree for classifying heart disease based on these
clinical features against un-pruned, pruned and pruned with reduced error pruning approach. Finally, the accuracy of
Pruned J48 Decision Tree with Reduced Error Pruning Approach is more better then the simple Pruned and Un-pruned
approach. Their result obtained that which shows that fasting blood sugar is the most important attribute which gives
better classification against the other attributes but its gives not better accuracy.
III. RESEARCH METHODOLOGIES
In this work, an architecture data mining technique based heart disease prediction system combining the prediction system
with mining technology was used. In this model we have used one of the classification algorithms called association
mining.
Simulated heart disease dataset containing 1000 patient records are used for this research work. This dataset contains 19
symptoms as shown in Table 1. They are the symptoms of various heart diseases.
TABLE I. Heart Disease Attributes for the Study
S. No Attributes Values
1 Male and Female
Age < 30 , Age >30 to <50, Age>50 and Age
<70, Age>70.
2 Smoking Never , Past, Current
3 Overweight Yes, No
4 Alcohol Intake Never , Past, Current
5 High salt diet Yes, No
6 High saturated fat diet Yes, No
7 Exercise Never, Regular
8 Sedentary Lifestyle/inactivity Yes, No
9 Hereditary Yes, No
10 Bad cholesterol
Very High >200, High 160 to 200,
Normal <160
11 Blood Pressure
Normal (130/89), Low (< 119/79)
High (>200/160)
12 Blood sugar
High(>120&<400), Normal(>90&<120), Low (
<90)
13 Heart Rate
Low (< 60bpm), Normal (60 to 100)
High (>100bpm)
14 Defect type Normal, Fixed, Reversible defect
15 Chest pain type
typical type 1, typical type angina
non-angina pain, asymptomatic
16 Resting electrographic results
Normal having ST_T wave abnormal left
ventricular hypertrophy
17
number of major vessels colored by
fluoroscopy
0-3 values
18 Dry or persistent Cough Yes, No
19 Skin rashes or Unusual Spots Yes, No
The data and item sets that occur frequently in the data base are known as frequent patterns. The frequent patterns that is
most significantly related to specific heart disease types and are helpful in predicting the disease and its type is known as
Significant frequent pattern. It has four levels of risk like low level, intermediate level, high level and very high level.
Based on the predicted risk values the range of risk will be assigned. In this study every individual health factors which
are taken for the study is analyzed and the causative factors are classified using data mining technique.
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 19
Paper Publications
Association Rules:
Association rules are if / then statement that help uncover relationships between seemingly unrelated data in a relational
database or other information repository. An association rule has two parts an antecedent (if) and a consequent (then). An
antecedent is an item found the data and consequence t is an item that is found in combination with the antecedents.
Association rules are created y analyzing data for frequent if / then patterns and using the criteria support and confidence
to identify the most important relationships. They are divided into separate categories in the data mining and used in the
Weka to perform the operations.
Apriori Algorithm:
The Apriori algorithm is a popular and foundational member for correlation based „Data Mining Kernels‟ used today. It
is used to process the data into more useful forms, in particular connection between set of items.
Dataset:
The clinical data sets are gathered from various health care centres. From the collected records data preprocessing is
applied to carry out the factor which are relevant for this study. Nearly 1000 health records are gathered and the data are
pre-processed. The following table shows the sample pre-processed dataset.
TABLE II. Sample Data Table
Attributes Values Values Values
Gender Male Male Female
Age 63 68 45
Smoking Yes No No
Weight (kgs.) 70 56 65
Alcohol Intake Yes No No
High salt diet Yes No Yes
High saturated fat diet No Yes No
Exercise No Yes No
Sedentary Lifestyle/inactivity Yes No Yes
Hereditary No Yes No
Bad cholesterol 170 160 180
Blood Pressure 130/89 160/100 140/90
Blood sugar 100 120 110
Heart Rate (bpm) 70 80 85
Defect type Normal Fixed Normal
Chest pain type typical type 1 No typical type 1
Resting electrographic results Normal Normal Abnormal
Number of major vessels colored by
fluoroscopy
0 0 0
Dry or persistent Cough Yes No No
Skin rashes or Unusual Spots No No No
The above table shows sample dataset. First the data set is converted as binary data set and it is analyzed using Weka
tool. This binary data set illustrates patients‟ health information in binary mode. Collections of Patient medical details
used for transaction databases and set of associations can be represented as binary incidence matrices with columns
corresponding to the factors and rows corresponding to the Patients. The matrix entries represent presence (1) or absence
(0) of a risk factor in a particular patient.
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 20
Paper Publications
TABLE III. Binary Data Set
Attributes Values Values Values
Age 1 1 0
Smoking 1 0 0
Weight (kgs.) 1 0 1
Alcohol Intake 1 0 0
High salt diet 1 0 1
High saturated fat diet 0 1 0
Exercise 0 1 0
Sedentary Lifestyle/inactivity 1 0 1
Hereditary 0 1 0
Bad cholesterol 1 0 1
Blood Pressure 0 1 0
Blood sugar 0 0 0
Heart Rate (bpm) 0 0 0
Defect type 0 1 0
Chest pain type 1 0 1
Resting electrographic results 0 0 1
number of major vessels colored by fluoroscopy 0 0 0
Dry or persistent Cough 1 0 0
Skin rashes or Unusual Spots 0 0 0
Data Preprocess:
Fig.3.1 Data Preprocess
The Fig. 3.1 shows the attributes which are involving in the study. There are nineteen factors are taken for this analysis.
The attributes are associated with one and another. This analysis find the support and confidence level of heart diseases
according various factors.
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 21
Paper Publications
IV. OBSERVATIONS AND RESULTS
When the apriori algorithm was implemented on the healthcare.arff file, it produced best association rules for that
healthcare.arff or dataset. Now the observation took place, to observe the result of the same operation on the same dataset
by the use of weka tool.
Fig.3.2 (a) Apriori Result
Fig.3.2 (b) Apriori Result
For a given data set, figure 3.2 (a & b) shows Apriori algorithm mined risk factors that have minimum support of all
transactions and it provides 10 best association rules.
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org
Page | 22
Paper Publications
V. CONCLUSION AND FUTURE ENHANCEMENT
Healthcare data set includes the patients‟ health information. The information is classified as hereditary, habitual and
medical test details. In which the seventeen causative factors which are gathered from domain experts are obtained for the
study. Factors are analyzed and the co-occurrence of the factors is found by generating candidate item set and pruning.
Finally the supportive measure of the combined factors is obtained through Apriori. For a given patient records risk and
non-risk factors are grouped using K-means clustering. The ranges of the causative factors are analyzed to predict the risk
level of the heart disease. It is no doubt that Apriori algorithm successfully finds the frequent elements from the database.
In future the work can be expanded and enhanced for the automation of various types of disease prediction. It also
extended to find various types of diseases with the use of these factors.
REFERENCES
[1] Neelamadhab Padhy, Dr. Pragnyaban Mishra , and Rasmita Panigrahi, “The Survey of Data Mining Applications
And Feature Scope”, International Journal of Computer Science, Engineering and Information Technology.
[2] Elma Kolçe, Neki Frasheri, “A Literature Review of Data Mining Techniques Used in Healthcare Databases”, ICT
Innovations 2012 Web Proceedings - Poster Session ISSN.
[3] T.John Peter., “An Empirical study on prediction of Heart disease using Classification Data mining techniques”
IEEE-International Conference On Advances In Engineering, Science And Management.
[4] K.Srinivas., G.Raghavendra Rao and A.Govardhan., “Analysis of Attribute Association in Heart Disease Using Data
Mining Techniques”, International Journal of Engineering Research and Applications.
[5] Nitika, Madan Lal Yadav, “A Fuzzification Approach for Prediction of Heart Disease”, International Journal of
Engineering Trends and Technology (IJETT).
[6] Jyoti Soni Ujma Ansari Dipesh Sharma., “Predictive Data Mining for Medical Diagnosis: An Overview of Heart
Disease Prediction”, International Journal of Computer Applications (IJCA).
[7] Atul Kumar Pandey, Prabhat Pandey, K.L., Jaiswal, Ashish Kumar Sen, “A Heart Disease Prediction Model using
Decision Tree”, IOSR Journal of Computer Engineering.

More Related Content

PDF
prediction of heart disease using machine learning algorithms
INFOGAIN PUBLICATION
 
PDF
Data Mining and Life Science: A Survey
rahulmonikasharma
 
PDF
Ijarcet vol-2-issue-4-1393-1397
Editor IJARCET
 
PDF
A comparative study on remote tracking of parkinson’s disease progression usi...
ijfcstjournal
 
PPTX
Stroke Prediction
MamathaGuntu1
 
PDF
Prediction of Heart Disease Using Data Mining Techniques- A Review
IRJET Journal
 
PDF
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
IJCI JOURNAL
 
PDF
SMART HEALTH PREDICTION USING DATA MINING by Dr.Mahboob Khan Phd
Healthcare consultant
 
prediction of heart disease using machine learning algorithms
INFOGAIN PUBLICATION
 
Data Mining and Life Science: A Survey
rahulmonikasharma
 
Ijarcet vol-2-issue-4-1393-1397
Editor IJARCET
 
A comparative study on remote tracking of parkinson’s disease progression usi...
ijfcstjournal
 
Stroke Prediction
MamathaGuntu1
 
Prediction of Heart Disease Using Data Mining Techniques- A Review
IRJET Journal
 
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
IJCI JOURNAL
 
SMART HEALTH PREDICTION USING DATA MINING by Dr.Mahboob Khan Phd
Healthcare consultant
 

What's hot (19)

PDF
IRJET - An Effective Stroke Prediction System using Predictive Models
IRJET Journal
 
PDF
H0333039042
theijes
 
PDF
C omparative S tudy of D iabetic P atient D ata’s U sing C lassification A lg...
Editor IJCATR
 
PDF
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
IRJET Journal
 
PDF
IRJET- Heart Disease Prediction System
IRJET Journal
 
DOCX
Using AI to Predict Strokes
EMMAIntl
 
PDF
A Heart Disease Prediction Model using Logistic Regression
ijtsrd
 
PDF
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
ijtsrd
 
PDF
HQR Framework optimization for predicting patient treatment time in big data
dbpublications
 
PDF
Propose a Enhanced Framework for Prediction of Heart Disease
IJERA Editor
 
PDF
HQR Framework optimization for predicting patient treatment time in big data
dbpublications
 
PDF
Heart Disease Prediction Using Data Mining Techniques
IJRES Journal
 
PDF
Framework for efficient transformation for complex medical data for improving...
IJECEIAES
 
PDF
Ascendable Clarification for Coronary Illness Prediction using Classification...
ijtsrd
 
PDF
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD Editor
 
DOCX
System proposal and crs model design applying
Finalyear Projects
 
PDF
Heart disease prediction
Ariful Haque
 
PDF
Big Data Analytics for Healthcare
Chandan Reddy
 
PDF
Psdot 14 using data mining techniques in heart
ZTech Proje
 
IRJET - An Effective Stroke Prediction System using Predictive Models
IRJET Journal
 
H0333039042
theijes
 
C omparative S tudy of D iabetic P atient D ata’s U sing C lassification A lg...
Editor IJCATR
 
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
IRJET Journal
 
IRJET- Heart Disease Prediction System
IRJET Journal
 
Using AI to Predict Strokes
EMMAIntl
 
A Heart Disease Prediction Model using Logistic Regression
ijtsrd
 
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
ijtsrd
 
HQR Framework optimization for predicting patient treatment time in big data
dbpublications
 
Propose a Enhanced Framework for Prediction of Heart Disease
IJERA Editor
 
HQR Framework optimization for predicting patient treatment time in big data
dbpublications
 
Heart Disease Prediction Using Data Mining Techniques
IJRES Journal
 
Framework for efficient transformation for complex medical data for improving...
IJECEIAES
 
Ascendable Clarification for Coronary Illness Prediction using Classification...
ijtsrd
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD Editor
 
System proposal and crs model design applying
Finalyear Projects
 
Heart disease prediction
Ariful Haque
 
Big Data Analytics for Healthcare
Chandan Reddy
 
Psdot 14 using data mining techniques in heart
ZTech Proje
 
Ad

Viewers also liked (11)

PDF
Response of Maize (Zea mays L.) for Moisture Stress Condition at Different Gr...
paperpublications3
 
PDF
Influence of farmer characteristics on the production of groundnuts, a case o...
paperpublications3
 
PDF
The Optimization of the Generalized Coplanar Impulsive Maneuvers (Two Impulse...
paperpublications3
 
PDF
Effect of Passive Damping on the Performance of Buck Converter for Magnet Load
paperpublications3
 
PDF
EFFECTS OF SMOKING IN THE PUBLIC PLACES: A PROPOSAL FOR SAFE SMOKING PLACES
paperpublications3
 
PDF
Electricity Production by Microorganisms
paperpublications3
 
PDF
Intercropping of Haricot Bean (Phaseolus vulgaris L.) with stevia (Stevia reb...
paperpublications3
 
PDF
SURVEY ON POTENTIALS AND CONSTRAINTS OF SHADE TREE SPECIES FOR ARABICA COFFEE...
paperpublications3
 
PDF
Spearmint (Mentha spicata L.) Response to Deficit Irrigation
paperpublications3
 
PDF
Influence of Farmer Level of Education on the Practice of Improved Agricultur...
paperpublications3
 
PDF
Factors Influencing Willingness to Recycle E-Waste in Kisumu City Central Bus...
paperpublications3
 
Response of Maize (Zea mays L.) for Moisture Stress Condition at Different Gr...
paperpublications3
 
Influence of farmer characteristics on the production of groundnuts, a case o...
paperpublications3
 
The Optimization of the Generalized Coplanar Impulsive Maneuvers (Two Impulse...
paperpublications3
 
Effect of Passive Damping on the Performance of Buck Converter for Magnet Load
paperpublications3
 
EFFECTS OF SMOKING IN THE PUBLIC PLACES: A PROPOSAL FOR SAFE SMOKING PLACES
paperpublications3
 
Electricity Production by Microorganisms
paperpublications3
 
Intercropping of Haricot Bean (Phaseolus vulgaris L.) with stevia (Stevia reb...
paperpublications3
 
SURVEY ON POTENTIALS AND CONSTRAINTS OF SHADE TREE SPECIES FOR ARABICA COFFEE...
paperpublications3
 
Spearmint (Mentha spicata L.) Response to Deficit Irrigation
paperpublications3
 
Influence of Farmer Level of Education on the Practice of Improved Agricultur...
paperpublications3
 
Factors Influencing Willingness to Recycle E-Waste in Kisumu City Central Bus...
paperpublications3
 
Ad

Similar to Heart Diseases Diagnosis Using Data Mining Techniques (20)

PDF
SURVEY OF DATA MINING TECHNIQUES USED IN HEALTHCARE DOMAIN
ijistjournal
 
PDF
SURVEY OF DATA MINING TECHNIQUES USED IN HEALTHCARE DOMAIN
ijistjournal
 
PDF
Ex33900906
IJERA Editor
 
PDF
Ex33900906
IJERA Editor
 
PDF
Benefits of Big Data in Health Care A Revolution
ijtsrd
 
PPTX
HEALTH PREDICTION ANALYSIS USING DATA MINING
Ashish Salve
 
PPTX
Data science in healthcare-Assignment 2.pptx
ArpitaDebnath20
 
PDF
Clustering algorithms for analysing electronic medical record: A mapping study
IAESIJAI
 
PDF
Paper id 36201506
IJRAT
 
PDF
50120140506011
IAEME Publication
 
PDF
Data Science in Healthcare
ijtsrd
 
PPTX
Medicine as a data science
improvemed
 
PPTX
Data science in healthcare-Assignment 2.pptx
ArpitaDebnath20
 
PDF
An Account of Critical Data Analysis in Medical_2nd.pdf
PrabahaGoswami
 
PDF
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
rahulmonikasharma
 
PDF
Biomedical Informatics
improvemed
 
PDF
IRJET- Review on Knowledge Discovery and Analysis in Healthcare using Dat...
IRJET Journal
 
PDF
Medicine as data science
improvemed
 
PDF
International Journal of Computational Engineering Research(IJCER)
ijceronline
 
PDF
1-s2.0-S0167923620300944-main.pdf
Deenadayalan Thanigaimalai
 
SURVEY OF DATA MINING TECHNIQUES USED IN HEALTHCARE DOMAIN
ijistjournal
 
SURVEY OF DATA MINING TECHNIQUES USED IN HEALTHCARE DOMAIN
ijistjournal
 
Ex33900906
IJERA Editor
 
Ex33900906
IJERA Editor
 
Benefits of Big Data in Health Care A Revolution
ijtsrd
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
Ashish Salve
 
Data science in healthcare-Assignment 2.pptx
ArpitaDebnath20
 
Clustering algorithms for analysing electronic medical record: A mapping study
IAESIJAI
 
Paper id 36201506
IJRAT
 
50120140506011
IAEME Publication
 
Data Science in Healthcare
ijtsrd
 
Medicine as a data science
improvemed
 
Data science in healthcare-Assignment 2.pptx
ArpitaDebnath20
 
An Account of Critical Data Analysis in Medical_2nd.pdf
PrabahaGoswami
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
rahulmonikasharma
 
Biomedical Informatics
improvemed
 
IRJET- Review on Knowledge Discovery and Analysis in Healthcare using Dat...
IRJET Journal
 
Medicine as data science
improvemed
 
International Journal of Computational Engineering Research(IJCER)
ijceronline
 
1-s2.0-S0167923620300944-main.pdf
Deenadayalan Thanigaimalai
 

Recently uploaded (20)

PPTX
FSSAI (Food Safety and Standards Authority of India) & FDA (Food and Drug Adm...
Dr. Paindla Jyothirmai
 
PDF
Sunset Boulevard Student Revision Booklet
jpinnuck
 
PDF
The-Invisible-Living-World-Beyond-Our-Naked-Eye chapter 2.pdf/8th science cur...
Sandeep Swamy
 
PPT
Python Programming Unit II Control Statements.ppt
CUO VEERANAN VEERANAN
 
PDF
Antianginal agents, Definition, Classification, MOA.pdf
Prerana Jadhav
 
PPTX
HISTORY COLLECTION FOR PSYCHIATRIC PATIENTS.pptx
PoojaSen20
 
PPTX
Software Engineering BSC DS UNIT 1 .pptx
Dr. Pallawi Bulakh
 
PDF
UTS Health Student Promotional Representative_Position Description.pdf
Faculty of Health, University of Technology Sydney
 
PDF
Presentation of the MIPLM subject matter expert Erdem Kaya
MIPLM
 
PPTX
CDH. pptx
AneetaSharma15
 
PPTX
CONCEPT OF CHILD CARE. pptx
AneetaSharma15
 
PPTX
Artificial-Intelligence-in-Drug-Discovery by R D Jawarkar.pptx
Rahul Jawarkar
 
PPTX
Odoo 18 Sales_ Managing Quotation Validity
Celine George
 
PPTX
Tips Management in Odoo 18 POS - Odoo Slides
Celine George
 
DOCX
SAROCES Action-Plan FOR ARAL PROGRAM IN DEPED
Levenmartlacuna1
 
PDF
1.Natural-Resources-and-Their-Use.ppt pdf /8th class social science Exploring...
Sandeep Swamy
 
PDF
Phylum Arthropoda: Characteristics and Classification, Entomology Lecture
Miraj Khan
 
PPTX
Five Point Someone – Chetan Bhagat | Book Summary & Analysis by Bhupesh Kushwaha
Bhupesh Kushwaha
 
PDF
Review of Related Literature & Studies.pdf
Thelma Villaflores
 
PPTX
Dakar Framework Education For All- 2000(Act)
santoshmohalik1
 
FSSAI (Food Safety and Standards Authority of India) & FDA (Food and Drug Adm...
Dr. Paindla Jyothirmai
 
Sunset Boulevard Student Revision Booklet
jpinnuck
 
The-Invisible-Living-World-Beyond-Our-Naked-Eye chapter 2.pdf/8th science cur...
Sandeep Swamy
 
Python Programming Unit II Control Statements.ppt
CUO VEERANAN VEERANAN
 
Antianginal agents, Definition, Classification, MOA.pdf
Prerana Jadhav
 
HISTORY COLLECTION FOR PSYCHIATRIC PATIENTS.pptx
PoojaSen20
 
Software Engineering BSC DS UNIT 1 .pptx
Dr. Pallawi Bulakh
 
UTS Health Student Promotional Representative_Position Description.pdf
Faculty of Health, University of Technology Sydney
 
Presentation of the MIPLM subject matter expert Erdem Kaya
MIPLM
 
CDH. pptx
AneetaSharma15
 
CONCEPT OF CHILD CARE. pptx
AneetaSharma15
 
Artificial-Intelligence-in-Drug-Discovery by R D Jawarkar.pptx
Rahul Jawarkar
 
Odoo 18 Sales_ Managing Quotation Validity
Celine George
 
Tips Management in Odoo 18 POS - Odoo Slides
Celine George
 
SAROCES Action-Plan FOR ARAL PROGRAM IN DEPED
Levenmartlacuna1
 
1.Natural-Resources-and-Their-Use.ppt pdf /8th class social science Exploring...
Sandeep Swamy
 
Phylum Arthropoda: Characteristics and Classification, Entomology Lecture
Miraj Khan
 
Five Point Someone – Chetan Bhagat | Book Summary & Analysis by Bhupesh Kushwaha
Bhupesh Kushwaha
 
Review of Related Literature & Studies.pdf
Thelma Villaflores
 
Dakar Framework Education For All- 2000(Act)
santoshmohalik1
 

Heart Diseases Diagnosis Using Data Mining Techniques

  • 1. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 15 Paper Publications Heart Diseases Diagnosis Using Data Mining Techniques 1 Dr. R. Muralidharan, 2 D. Vinodhini 1 M.Sc. M.Phil. MCA., Ph.D. Vice Principal & HOD in CS Rathinam College of Arts & Science Coimbatore, India 2 M.Sc., Department of Computer Science, Rathinam College of Arts & Science, Coimbatore, India Abstract: In health care domain, data mining plays a vital role for predicting diseases. For detecting a disease number of tests should be required from the patient but number of test should be reduced while using data mining techniques. The data mining technique analyze the test parameters and it concludes the associative relation between the parameters that reduces the tests and the reduced test plays key role in time and performance. In this study medical terms such as sex, blood pressure, and cholesterol like nineteen input attributes are used. In this paper association among various attributes which are the causative factors of heart diseases are analyzed. The patient’s records are observed before prediction and the factors are grouped as per its severity level. In this system the level of causative factors are categorized using K-Means clustering technique and it distinguishes the risky and non-risky factors. Frequent risk factors are mined from the clinical heart database using Apriori algorithm. The risk factors are taken for this study to predict the risk level and find the co-ordination among the factors that helps the medical people to predict the disease with minimum tests and treatments. Keywords: Associative Mining algorithm, Data mining, Heart disease, K-Means Clustering. I. INTRODUCTION Data mining is the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data Mining is a very crucial research domain in recent research world. The techniques are useful to elicit significant and utilizable knowledge which can be perceived by many individuals. Data mining programs consists of diverse methodologies which are predominantly produced and used by commercial enterprises and biomedical researchers. These techniques are well disposed towards their respective knowledge domain. The heart is very important part of human body, which pumps blood into the entire body. If circulation of blood in body is inefficient the organs like brain suffer and if heart stops working altogether, death occurs within minutes. Life is completely dependent on efficient working of the heart. The term Heart disease refers to disease of heart & blood vessel system within it. This study is closely related to heart diseases prediction and it provides an awareness to the patient to prevent them from unexpected heart problem. This paper analyzes the heart disease predictions using classification algorithms. Medicinal data mining has high potential for exploring the unknown patterns in the data sets of medical domain. These patterns can be used for medical analysis in raw medical data. Heart disease was the major cause of casualties in the world. Half of the deaths occur in the countries like India, United States are due to cardiovascular diseases. Medical data mining techniques like Association Rule Mining, Clustering, and Classification Algorithms are applied to analyze the different kinds of heart based problems. Data mining is a process of extracting/equating the meaningful data from the large database by using different tools and techniques. Data mining (sometimes called information or knowledge discovery) is the process of evaluating data from different outlooks and summarizing it into useful information - information that can be used to increase revenue, cuts costs, or both. Data mining software is one of a number of diagnostic tools for analyzing data. It allows users to investigate data from many different dimensions or angles, categorize it, and summarize the relationships identified.
  • 2. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 16 Paper Publications Technically, data mining is the process of finding correlations or forms among dozens of fields in large relational databases. Data mining or knowledge discovery in databases (KDD) is an interdisciplinary field where we integrate techniques from different fields including data base systems, statistics, mathematics, high performance computing, artificial intelligence and machine learning. Fig. 1 Basic idea about KDD This data mining techniques are used in health care domain for prediction of problem, disease detection, optimizing the solution and so on. Recent technologies are nowadays able to provide a lot of information on health-related activities, which can then be analyzed in order to find important information and to collect relevant information. This data mining techniques are used for disease detection, pattern recognition by using multiple application. Data mining is about to identify the similarities between searching the valuable business information from the large database systems such as finding linked products in gigabytes of store scanner data or the mining a mountain for a vein of valuable dataset. Both kind of processes required either shifting through an immense amount of material, or to perform the search intelligently so that exactly match will be performed. Data mining can be done on a database whose size and quality are sufficient. Data mining techniques are often used to study health factors. The purpose of the study is to analyze the human health using classification techniques and predict the risk level of heart diseases based on the health factors. The study applied data mining methods like k-means and associative mining facilitate an improvement in the interpretation of the clinical data set. II. REVIEW OF LITERATURE The paper titled “The Survey of Data Mining Applications and Feature Scope” - focused a variety of techniques, approaches and different areas of the research which are helpful and marked as the important field of data mining Technologies. As they are aware that many MNC‟s and large organizations are operated in different places of the different countries. Each place of operation may generate large volumes of data. Corporate decision makers require access from all such sources and take strategic decisions. The data warehouse is used in the significant business value by improving the effectiveness of managerial decision-making. In an uncertain and highly competitive business environment, the value of strategic information systems such as these are easily recognized however in today‟s business environment, efficiency or speed is not the only key for competitiveness. These types of huge amount of data‟s are available in the form of tera- to peta-bytes which has drastically changed in the areas of science and engineering. To analyze, manage and make a decision of such type of huge amount of data we need techniques called the data mining which will transforming in many fields. They imparts more number of applications of data mining and also focuses scope of the data mining which will helpful in the further research.
  • 3. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 17 Paper Publications “A Literature Review of Data Mining Techniques Used in Healthcare Databases” - presented an overview of the current research being carried out using the data mining techniques for the diagnosis and prognosis of various diseases. The goal of this study is to identify the well-performing data mining algorithms used on medical databases. The following algorithms have been identified: Decision Trees, Support Vector Machine, Artificial neural networks and their Multilayer Perceptron model, Naive Bayes, Apriori, Fuzzy Rules. Analyses show that it is very difficult to name a single data mining algorithm as the most suitable for the diagnosis and/or prognosis of diseases. At times some algorithms perform better than others, but there are cases when a combination of the best properties of some of the aforementioned algorithms together results more effective. “An Empirical study on prediction of Heart disease using Classification Data mining techniques” - the use of pattern recognition and data mining techniques into risk prediction models in the clinical domain of cardiovascular medicine is proposed. The data is to be modelled and classified by using classification data mining technique. Some of the limitations of the conventional medical scoring systems are that there is a presence of intrinsic linear combinations of variables in the input set and hence they are not adept at modelling nonlinear complex interactions in medical domains. This limitation is handled in this research by use of classification models which can implicitly detect complex nonlinear relationships between dependent and independent variables as well as the ability to detect all possible interactions between predictor variables. “Analysis of Attribute Association in Heart Disease Using Data Mining Techniques” - In data mining association rule mining represents a promising technique to find hidden patterns in large data bases. The main issue about mining association rules in a medical data is the large number of rules that are discovered, most of which are irrelevant. A rule- based decision support system (DSS) is presented for the diagnosis of coronary vascular disease (CVD). The dataset used for the DSS generation and evaluation consists of 1897 subjects, each one characterized by 21 features, including demographic and history data, as well as laboratory examinations. Such number of rules makes the search slow. However, not all of the generated rules are interesting, and some rules may be ignored. In medical terms, association rules relate disease data measures the patient risk factors and occurrence of the disease. Association rule medical significance is evaluated with the usual support and confidence metrics. Association rules are compared to predictive rules mined with decision trees, a well-known machine learning technique. In this paper [12] they proposed a new system to find the strength of association among the attributes of a given data set. “A Fuzzification Approach for Prediction of Heart Disease” - Data Mining operations and approaches are the improvement over the statistical methods that enables a user to perform the future analysis based on current dataset. One of such analysis provided by data mining approaches is the predication based analysis. In their work [13], the heart disease prediction system is designed. The heart disease prediction is actually an expert system application which requires the authenticated dataset to process. A Fuzzy based soft computing approach is been implemented on multiple parameters to predict the heart disease. In this paper [13], the earlier work done in the area of medical disease prediction is studied as well a new fuzzy rule based approach is suggested to perform the heart disease prediction. “Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction” - The healthcare environment is still “information rich but knowledge poor”. There is a wealth of data available within the healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. This research paper [14] intends to provide a survey of current techniques of knowledge discovery in databases using data mining techniques that are in use in today‟s medical research particularly in Heart Disease Prediction. Number of experiment has been conducted to compare the performance of predictive data mining technique on the same dataset and the outcome reveals that Decision Tree outperforms and some time Bayesian classification is having similar accuracy as of decision tree but other predictive methods like Neural Networks, Classification based on clustering are not performing well. The second conclusion is that the accuracy of the Decision Tree and Bayesian Classification further improves after applying genetic algorithm to reduce the actual data size to get the optimal subset of attribute sufficient for heart disease prediction. A Heart Disease Prediction Model using Decision Tree - In this paper [15], they developed a heart disease prediction model that can assist medical professionals in predicting heart disease status based on the clinical data of patients. Firstly, they select 14 important clinical features, i.e., age, sex, chest pain type, Blood Pressure, cholesterol, fasting blood sugar, resting ECG, max heart rate, exercise induced angina, old peak, slope, number of vessels colored and diagnosis of heart
  • 4. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 18 Paper Publications disease. Secondly, they developed a prediction model using J48 decision tree for classifying heart disease based on these clinical features against un-pruned, pruned and pruned with reduced error pruning approach. Finally, the accuracy of Pruned J48 Decision Tree with Reduced Error Pruning Approach is more better then the simple Pruned and Un-pruned approach. Their result obtained that which shows that fasting blood sugar is the most important attribute which gives better classification against the other attributes but its gives not better accuracy. III. RESEARCH METHODOLOGIES In this work, an architecture data mining technique based heart disease prediction system combining the prediction system with mining technology was used. In this model we have used one of the classification algorithms called association mining. Simulated heart disease dataset containing 1000 patient records are used for this research work. This dataset contains 19 symptoms as shown in Table 1. They are the symptoms of various heart diseases. TABLE I. Heart Disease Attributes for the Study S. No Attributes Values 1 Male and Female Age < 30 , Age >30 to <50, Age>50 and Age <70, Age>70. 2 Smoking Never , Past, Current 3 Overweight Yes, No 4 Alcohol Intake Never , Past, Current 5 High salt diet Yes, No 6 High saturated fat diet Yes, No 7 Exercise Never, Regular 8 Sedentary Lifestyle/inactivity Yes, No 9 Hereditary Yes, No 10 Bad cholesterol Very High >200, High 160 to 200, Normal <160 11 Blood Pressure Normal (130/89), Low (< 119/79) High (>200/160) 12 Blood sugar High(>120&<400), Normal(>90&<120), Low ( <90) 13 Heart Rate Low (< 60bpm), Normal (60 to 100) High (>100bpm) 14 Defect type Normal, Fixed, Reversible defect 15 Chest pain type typical type 1, typical type angina non-angina pain, asymptomatic 16 Resting electrographic results Normal having ST_T wave abnormal left ventricular hypertrophy 17 number of major vessels colored by fluoroscopy 0-3 values 18 Dry or persistent Cough Yes, No 19 Skin rashes or Unusual Spots Yes, No The data and item sets that occur frequently in the data base are known as frequent patterns. The frequent patterns that is most significantly related to specific heart disease types and are helpful in predicting the disease and its type is known as Significant frequent pattern. It has four levels of risk like low level, intermediate level, high level and very high level. Based on the predicted risk values the range of risk will be assigned. In this study every individual health factors which are taken for the study is analyzed and the causative factors are classified using data mining technique.
  • 5. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 19 Paper Publications Association Rules: Association rules are if / then statement that help uncover relationships between seemingly unrelated data in a relational database or other information repository. An association rule has two parts an antecedent (if) and a consequent (then). An antecedent is an item found the data and consequence t is an item that is found in combination with the antecedents. Association rules are created y analyzing data for frequent if / then patterns and using the criteria support and confidence to identify the most important relationships. They are divided into separate categories in the data mining and used in the Weka to perform the operations. Apriori Algorithm: The Apriori algorithm is a popular and foundational member for correlation based „Data Mining Kernels‟ used today. It is used to process the data into more useful forms, in particular connection between set of items. Dataset: The clinical data sets are gathered from various health care centres. From the collected records data preprocessing is applied to carry out the factor which are relevant for this study. Nearly 1000 health records are gathered and the data are pre-processed. The following table shows the sample pre-processed dataset. TABLE II. Sample Data Table Attributes Values Values Values Gender Male Male Female Age 63 68 45 Smoking Yes No No Weight (kgs.) 70 56 65 Alcohol Intake Yes No No High salt diet Yes No Yes High saturated fat diet No Yes No Exercise No Yes No Sedentary Lifestyle/inactivity Yes No Yes Hereditary No Yes No Bad cholesterol 170 160 180 Blood Pressure 130/89 160/100 140/90 Blood sugar 100 120 110 Heart Rate (bpm) 70 80 85 Defect type Normal Fixed Normal Chest pain type typical type 1 No typical type 1 Resting electrographic results Normal Normal Abnormal Number of major vessels colored by fluoroscopy 0 0 0 Dry or persistent Cough Yes No No Skin rashes or Unusual Spots No No No The above table shows sample dataset. First the data set is converted as binary data set and it is analyzed using Weka tool. This binary data set illustrates patients‟ health information in binary mode. Collections of Patient medical details used for transaction databases and set of associations can be represented as binary incidence matrices with columns corresponding to the factors and rows corresponding to the Patients. The matrix entries represent presence (1) or absence (0) of a risk factor in a particular patient.
  • 6. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 20 Paper Publications TABLE III. Binary Data Set Attributes Values Values Values Age 1 1 0 Smoking 1 0 0 Weight (kgs.) 1 0 1 Alcohol Intake 1 0 0 High salt diet 1 0 1 High saturated fat diet 0 1 0 Exercise 0 1 0 Sedentary Lifestyle/inactivity 1 0 1 Hereditary 0 1 0 Bad cholesterol 1 0 1 Blood Pressure 0 1 0 Blood sugar 0 0 0 Heart Rate (bpm) 0 0 0 Defect type 0 1 0 Chest pain type 1 0 1 Resting electrographic results 0 0 1 number of major vessels colored by fluoroscopy 0 0 0 Dry or persistent Cough 1 0 0 Skin rashes or Unusual Spots 0 0 0 Data Preprocess: Fig.3.1 Data Preprocess The Fig. 3.1 shows the attributes which are involving in the study. There are nineteen factors are taken for this analysis. The attributes are associated with one and another. This analysis find the support and confidence level of heart diseases according various factors.
  • 7. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 21 Paper Publications IV. OBSERVATIONS AND RESULTS When the apriori algorithm was implemented on the healthcare.arff file, it produced best association rules for that healthcare.arff or dataset. Now the observation took place, to observe the result of the same operation on the same dataset by the use of weka tool. Fig.3.2 (a) Apriori Result Fig.3.2 (b) Apriori Result For a given data set, figure 3.2 (a & b) shows Apriori algorithm mined risk factors that have minimum support of all transactions and it provides 10 best association rules.
  • 8. ISSN 2350-1049 International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS) Vol. 3, Issue 4, pp: (15-22), Month: October - December 2016, Available at: www.paperpublications.org Page | 22 Paper Publications V. CONCLUSION AND FUTURE ENHANCEMENT Healthcare data set includes the patients‟ health information. The information is classified as hereditary, habitual and medical test details. In which the seventeen causative factors which are gathered from domain experts are obtained for the study. Factors are analyzed and the co-occurrence of the factors is found by generating candidate item set and pruning. Finally the supportive measure of the combined factors is obtained through Apriori. For a given patient records risk and non-risk factors are grouped using K-means clustering. The ranges of the causative factors are analyzed to predict the risk level of the heart disease. It is no doubt that Apriori algorithm successfully finds the frequent elements from the database. In future the work can be expanded and enhanced for the automation of various types of disease prediction. It also extended to find various types of diseases with the use of these factors. REFERENCES [1] Neelamadhab Padhy, Dr. Pragnyaban Mishra , and Rasmita Panigrahi, “The Survey of Data Mining Applications And Feature Scope”, International Journal of Computer Science, Engineering and Information Technology. [2] Elma Kolçe, Neki Frasheri, “A Literature Review of Data Mining Techniques Used in Healthcare Databases”, ICT Innovations 2012 Web Proceedings - Poster Session ISSN. [3] T.John Peter., “An Empirical study on prediction of Heart disease using Classification Data mining techniques” IEEE-International Conference On Advances In Engineering, Science And Management. [4] K.Srinivas., G.Raghavendra Rao and A.Govardhan., “Analysis of Attribute Association in Heart Disease Using Data Mining Techniques”, International Journal of Engineering Research and Applications. [5] Nitika, Madan Lal Yadav, “A Fuzzification Approach for Prediction of Heart Disease”, International Journal of Engineering Trends and Technology (IJETT). [6] Jyoti Soni Ujma Ansari Dipesh Sharma., “Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction”, International Journal of Computer Applications (IJCA). [7] Atul Kumar Pandey, Prabhat Pandey, K.L., Jaiswal, Ashish Kumar Sen, “A Heart Disease Prediction Model using Decision Tree”, IOSR Journal of Computer Engineering.