International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 6937
Disease Prediction Using Machine Learning
Akash C. Jamgade, Prof. S. D. Zade
Student, Dept. of Computer Science and Engineering, Priyadarshini Institute of Engineering & Technology,
Nagpur, Maharashtra, India
Professor, Dept. of Computer Science and Engineering, Priyadarshini Institute of Engineering & Technology,
Nagpur, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - One application of machine learning algorithms is
in the field of healthcare. Medical facilities need to advance
so that better decisions about patient diagnosis and treatment
options can be made. Machine learning in healthcare helps
humans process huge and complex medical datasets and turn
them into clinical insights, which physicians can then use in
providing medical care. Hence machine learning, when
implemented in healthcare, can lead to increased patient
satisfaction. In this paper, we try to implement the
functionalities of machine learning in healthcare in a single
system. When disease prediction, rather than diagnosis alone,
is implemented using machine learning predictive algorithms,
healthcare can be made smart. In some cases early diagnosis
of a disease is not within reach, and there disease prediction
can be applied effectively. As the saying goes, “Prevention is
better than cure”: prediction of diseases and epidemic
outbreaks would lead to early prevention of the occurrence of
a disease. This paper mainly focuses on the development of a
system, in effect an immediate medical provision, which would
incorporate the symptoms collected from multisensory devices
and other medical data and store them in a healthcare dataset.
This dataset would then be analyzed using the k-means machine
learning algorithm to deliver results with maximum accuracy.
Key Words: Big Data, healthcare, machine learning, k-means
algorithm
1. INTRODUCTION
Disease prediction from patient treatment history and health
data, by applying data mining and machine learning
techniques, has been an ongoing challenge for the past
decades. Many works have applied data mining techniques to
pathological data or medical profiles for the prediction of
specific diseases. These approaches tried to predict the
recurrence of disease. Some approaches also try to predict
the control and progression of disease. The recent success of
deep learning in disparate areas of machine learning has
driven a shift towards machine learning models that can learn
rich, hierarchical representations of raw data with little
preprocessing and produce more accurate results. With the
development of big data technology, more attention has been
paid to disease prediction from the perspective of big data
analysis; various studies have selected characteristics
automatically from large amounts of data, rather than relying
on previously selected characteristics, to improve the
accuracy of risk classification.
The main focus is on using machine learning in healthcare to
supplement patient care for better results. Machine learning
has made it easier to identify different diseases and diagnose
them correctly. Predictive analysis with the help of multiple
efficient machine learning algorithms helps to predict disease
more accurately and helps treat patients.
The healthcare industry produces large amounts of healthcare
data daily. Together with treatment history and health data,
this data can be used to extract information for predicting
diseases that may affect a patient in the future. This hidden
information in the healthcare data can later be used for
effective decision making about the patient’s health. This
area also needs improvement through better use of the
informative data in healthcare.
One application of machine learning algorithms is in the field
of healthcare. Medical facilities need to advance so that
better decisions about patient diagnosis and treatment options
can be made. Machine learning in healthcare helps humans
process huge and complex medical datasets and turn them into
clinical insights, which physicians can then use in providing
medical care. Hence machine learning, when implemented in
healthcare, can lead to increased patient satisfaction. In this
work, the k-means algorithm is used to predict diseases from
patient treatment history and health data.
2. EXISTING SYSTEM
Prediction using a traditional disease risk model usually
involves a supervised machine learning algorithm, which uses
labeled training data to train the model. Patients in the test
set are then classified into high-risk and low-risk groups.
These models are widely studied, but they are valuable only in
clinical situations. Chen et al. proposed a system for
sustainable health monitoring using smart clothing. The
authors thoroughly studied heterogeneous systems and achieved
the best results for cost minimization on the tree and simple
path cases for heterogeneous systems.
Patient statistics, test results, and disease history are
recorded in the electronic health record (EHR), which makes it
possible to identify potential data-centric solutions that
reduce the cost of medical case studies. Bates et al. [1]
propose six applications of big data in the healthcare field.
Existing systems can predict diseases but not their subtypes,
and they fail to predict the condition of people.
The predictions of diseases have been non-specific and
indefinite.
3. PROPOSED SYSTEM
In this paper, we combine structured and unstructured data
from the healthcare field to assess the risk of disease. A
latent factor model is used to reconstruct the missing data in
medical records collected from the hospital. Using statistical
knowledge, we can determine the major chronic diseases in a
particular region and in a particular community. To handle
structured data, we consult hospital experts to identify
useful features.
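The paper does not specify the latent factor model used for reconstructing missing record values. As a minimal sketch, assuming a plain low-rank matrix factorization trained by stochastic gradient descent on the observed entries only, the idea can be illustrated as follows. The function names, the toy `records` matrix (values scaled to [0, 1], `None` marking a missing entry), and all hyperparameters are hypothetical.

```python
import random

def predict(P, Q, i, j):
    """Predicted value for patient i, attribute j: dot product of latent factors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

def factorize(records, rank=2, steps=2000, lr=0.05, reg=0.02, seed=0):
    """Fit records ~ P * Q^T using only the observed (non-None) entries."""
    rng = random.Random(seed)
    n_rows, n_cols = len(records), len(records[0])
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    observed = [(i, j, v) for i, row in enumerate(records)
                for j, v in enumerate(row) if v is not None]
    for _ in range(steps):
        for i, j, value in observed:
            err = value - predict(P, Q, i, j)
            for f in range(rank):
                p, q = P[i][f], Q[j][f]
                # Gradient step with L2 regularization on both factors.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q

def observed_mse(records, P, Q):
    """Mean squared error over the observed entries only."""
    errors = [(v - predict(P, Q, i, j)) ** 2
              for i, row in enumerate(records)
              for j, v in enumerate(row) if v is not None]
    return sum(errors) / len(errors)

def fill_missing(records, P, Q):
    """Keep observed values; replace each None with the model's prediction."""
    return [[v if v is not None else predict(P, Q, i, j)
             for j, v in enumerate(row)]
            for i, row in enumerate(records)]

# Hypothetical normalized records: rows are patients, columns are attributes.
records = [
    [0.9, 0.8, None],
    [0.8, None, 0.2],
    [0.2, 0.3, 0.9],
    [None, 0.2, 0.8],
]
P, Q = factorize(records)
completed = fill_missing(records, P, Q)
print(completed)
```

Observed values are left untouched; only the gaps are filled from the learned factors. In practice the rank and regularization strength would need to be chosen by validation on held-out observed entries.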
In the case of unstructured text data, we select the features
automatically with the help of the k-means algorithm. We
propose a k-means algorithm for both structured and
unstructured data.
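The paper does not describe how unstructured text is turned into vectors that k-means can consume. One common assumption is a bag-of-words encoding over a fixed vocabulary, concatenated with the structured features. The sketch below illustrates only that assumption; the vocabulary, the structured fields, and the function names are hypothetical.

```python
from collections import Counter

def text_to_counts(note, vocabulary):
    """Bag-of-words: count how often each vocabulary term occurs in the note.

    Tokenization is a plain lowercase whitespace split, so punctuation
    attached to a word is not stripped.
    """
    counts = Counter(note.lower().split())
    return [float(counts[term]) for term in vocabulary]

def combine_features(structured, note, vocabulary):
    """Concatenate structured features with the text-derived counts."""
    return list(structured) + text_to_counts(note, vocabulary)

# Hypothetical vocabulary and patient record, for illustration only.
vocabulary = ["cough", "fever", "chest", "pain"]
patient_structured = [54.0, 1.0, 130.0]  # assumed: age, smoker flag, systolic pressure
patient_note = "Persistent cough and mild fever"

vector = combine_features(patient_structured, patient_note, vocabulary)
print(vector)
```

Because k-means relies on Euclidean distance, features on very different scales (an age in years next to a word count) would normally be normalized before clustering.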
3.1 The k-means algorithm
The k-means algorithm is a simple iterative method to
partition a given dataset into a specified number of clusters,
k. This algorithm has been discovered by several researchers
across different disciplines. The algorithm operates on a set
of d-dimensional vectors, $D = \{x_i \mid i = 1, \ldots, N\}$,
where $x_i \in \mathbb{R}^d$ denotes the ith data point. The
algorithm is initialized by picking k points in $\mathbb{R}^d$
as the initial k cluster centers. Techniques for selecting
these initial seeds include sampling at random from the
dataset, setting them as the solution of clustering a small
subset of the data, or perturbing the global mean of the data
k times.
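The description above covers initialization only. A minimal sketch of the standard iteration (Lloyd's algorithm: assign each point to its nearest center, then move each center to the mean of its points) is given below, seeded by sampling at random from the dataset. It is an illustration under those assumptions, not the authors' implementation; the function names, the optional `initial_centers` parameter, and the toy data are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two d-dimensional points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, initial_centers=None, max_iters=100, seed=0):
    """Partition points into k clusters; returns (centers, labels)."""
    if initial_centers is None:
        # Seed by sampling k distinct points at random from the dataset.
        rng = random.Random(seed)
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the method has converged
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Toy 2-D data with two visually separate groups.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
              (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centers, labels = k_means(toy_points, k=2)
print(centers, labels)
```

The result depends on the initial seeds, which is why the choice of initialization technique matters; running the method from several seeds and keeping the partition with the lowest total within-cluster distance is a common safeguard.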
4. SYSTEM ARCHITECTURE
Fig -1: System Architecture
5. CONCLUSION
With the proposed system, higher accuracy can be achieved. We
use not only structured data but also the text data of the
patient, based on the proposed k-means algorithm. When both
kinds of data are combined, the accuracy rate can reach up to
95%. None of the existing systems and work focuses on using
both data types in the field of medical big data analytics. We
propose a k-means clustering algorithm for both structured and
unstructured data. The disease risk model is obtained by
combining both structured and unstructured features.
ACKNOWLEDGEMENT
I express my sincere gratitude to my guide, Prof. S. D. Zade,
for constant help, encouragement and inspiration throughout
the project work. I would also like to thank the Head of the
Computer Science and Engineering Department, Dr. P. S. Prasad,
whose valuable guidance, ability to motivate me and
willingness to resolve difficulties made it possible to make
my project unique and made the task easier. My sincere thanks
to the Principal, Dr. V. M. Nanoti, for providing the
necessary facilities to carry out the work.
REFERENCES
[1] D. W. Bates, S. Saria, L. Ohno-Machado, A. Shah, and G.
Escobar, “Big data in health care: using analytics to
identify and manage high-risk and high-cost patients,”
Health Affairs, vol. 33, no. 7, pp. 1123–1131, 2014.
[2] K. R. Lakshmi, Y. Nagesh, and M. VeeraKrishna,
“Performance comparison of three data mining
techniques for predicting kidney disease survivability,”
International Journal of Advances in Engineering &
Technology, Mar. 2014.
[3] Chala Beyene and Pooja Kamat, “Survey on
Prediction and Analysis the Occurrence of Heart Disease
Using Data Mining Techniques,” International Journal of
Pure and Applied Mathematics, 2018.
[4] Boshra Brahmi and Mirsaeid Hosseini Shirvani, “Prediction
and Diagnosis of Heart Disease by Data Mining
Techniques,” Journals of Multidisciplinary Engineering
Science and Technology, vol. 2, 2 February 2015,
pp. 164-168.
[5] A. Singh, G. Nadkarni, O. Gottesman, S. B. Ellis, E. P.
Bottinger, and J. V. Guttag, “Incorporating temporal EHR
data in predictive models for risk stratification of renal
function deterioration,” Journal of Biomedical
Informatics, vol. 53, pp. 220–228, 2015.
[6] S. Patel and H. Patel, “Survey of data mining techniques
used in healthcare domain,” Int. J. of Inform. Sci. and
Tech., vol. 6, pp. 53-60, March 2016.
