Disease Prediction of Adiposity Using ML
Bachelor of Technology
in
Computer Science & Engineering
By
May, 2025
BATCH NO: MI1001
ADAPTIVE HEALTH MONITORING AND DISEASE
PREDICTION
CERTIFICATE
It is certified that the work contained in the project report titled "ADAPTIVE HEALTH
MONITORING AND DISEASE PREDICTION" by "G V V SATYANARAYANA (22UECM0086), Y
VENKATA SUDHEER (22UECM0292), M SAI TEJA (22UECM0153)" has been carried out under
my supervision and that this work has not been submitted elsewhere for a degree.
Signature of Supervisor
Dr. R.LOTUS
Associate Professor
Computer Science & Engineering
School of Computing
Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science and Technology
May, 2025
DECLARATION
We declare that this written submission represents our ideas in our own words and where others’
ideas or words have been included, we have adequately cited and referenced the original sources. We
also declare that we have adhered to all principles of academic honesty and integrity and have not
misrepresented or fabricated or falsified any idea/data/fact/source in our submission. We understand
that any violation of the above will be cause for disciplinary action by the Institute and can also
evoke penal action from the sources which have thus not been properly cited or from whom proper
permission has not been taken when needed.
(Signature)
G V V SATYANARAYANA
Date: / /
(Signature)
Y VENKATA SUDHEER
Date: / /
(Signature)
M SAI TEJA
Date: / /
APPROVAL SHEET
This project report entitled ADAPTIVE HEALTH MONITORING AND DISEASE PREDICTION
by G V V SATYANARAYANA (22UECM0086), Y VENKATA SUDHEER (22UECM0292), M SAI
TEJA (22UECM0153) is approved for the degree of B.Tech in Computer Science & Engineering.
Examiners Supervisor
Date: / /
Place:
ACKNOWLEDGEMENT
We express our deepest gratitude to our Honorable Founder Chancellor and President Col.
Prof. Dr. R. RANGARAJAN B.E. (Electrical), B.E. (Mechanical), M.S (Automobile), D.Sc., and
Foundress President Dr. R. SAGUNTHALA RANGARAJAN M.B.B.S., Vel Tech Rangarajan Dr.
Sagunthala R&D Institute of Science and Technology, for their blessings.
We express our sincere thanks to our respected Chairperson and Managing Trustee
Mrs. RANGARAJAN MAHALAKSHMI KISHORE, B.E., Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science and Technology, for her blessings.
We are very much grateful to our beloved Vice Chancellor Prof. Dr. RAJAT GUPTA, for
providing us with an environment to complete our project successfully.
We are thankful to our Professor & Head, Department of Computer Science & Engineering,
Dr. N. VIJAYARAJ, M.E., Ph.D., and Associate Professor & Assistant Head, Department of
Computer Science & Engineering, Dr. M. S. MURALI DHAR, M.E., Ph.D., for providing
immense support in all our endeavors.
We also take this opportunity to express a deep sense of gratitude to our Internal Supervisor Dr.
R. LOTUS, M.Tech., Ph.D., for cordial support, valuable information, and guidance, which
helped us in completing this project through its various stages.
We thank our department faculty, supporting staff and friends for their help and guidance to com-
plete this project.
G V V SATYANARAYANA (22UECM0086)
Y VENKATA SUDHEER (22UECM0292)
M SAI TEJA (22UECM0153)
ABSTRACT
Keywords:
Random Forest, Gradient Boosting, Parkinson’s Disease, Diabetes, Adipos-
ity, Symptom Analyzer, Disease Prediction, Health Assessment, Machine Learn-
ing, Early Detection, Risk Estimation, Medical Interface, Preventive Health-
care, Accessibility, Health Intelligence.
LIST OF FIGURES
6.1 Output 1
6.2 Output 3
6.3 Output 2
6.4 Output 4
LIST OF TABLES
LIST OF ACRONYMS AND
ABBREVIATIONS
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION
1.1 Introduction
1.2 Aim of the project
1.3 Project Domain
1.4 Scope of the Project
2 LITERATURE REVIEW
2.1 Literature Review
2.2 Gap Identification
3 PROJECT DESCRIPTION
3.1 Existing System
3.2 Problem statement
3.3 System Specification
3.3.1 Hardware Specification
3.3.2 Software Specification
3.3.3 Standards and Policies
4 METHODOLOGY
4.1 Proposed System
4.2 General Architecture
4.3 Design Phase
4.3.1 Data Flow Diagram
4.3.2 Use Case Diagram
4.3.3 Class Diagram
4.3.4 Sequence Diagram
4.3.5 Activity Diagram
4.4 Algorithm & Pseudo Code
4.4.1 Algorithm
4.4.2 Pseudo Code
4.4.3 Data Set
4.5 Module Description
4.5.1 Module 1: Diabetes Disease Prediction
4.5.2 Module 2: Parkinson’s Disease Prediction
4.5.3 Module 3: Adiposity Risk Prediction
4.5.4 Module 4: Symptom Analyzer (Random Forest)
8 PLAGIARISM REPORT
9 Source Code
References
Chapter 1
INTRODUCTION
1.1 Introduction
The aim of the ML-Based Disease Prediction and Symptom Analyzer project
is to develop an intelligent system that predicts the risk of Parkinson’s disease, Adi-
posity, and Diabetes using Gradient Boosting and Random Forest algorithms. By
leveraging clinical data such as blood pressure, blood sugar, calcium levels, and heart
rate, this system provides accurate health predictions to help individuals assess their
risk of developing these diseases. The goal is to make the technology accessible to
users without requiring medical expertise, enabling them to understand their health
status and take preventive actions early. The system collects data from reliable med-
ical sources, cleans, and normalizes it for consistency before dividing it into training
and testing datasets for effective model evaluation. The project aims to optimize the
models for high accuracy in prediction, focusing on metrics such as precision, recall,
and F1-score. Another key objective is to integrate a Symptom Analyzer module that
uses the Random Forest algorithm to map reported symptoms to possible diseases,
offering immediate insights for users. This feature enhances the system’s usability
by allowing individuals to self-assess and detect potential health risks. Ultimately,
the project aims to empower users with early detection tools, encourage preventive
healthcare measures, and provide timely medical insights to reduce risks associated
with chronic diseases.
1.3 Project Domain
The project domain of the ML-Based Disease Prediction and Symptom Analyzer
lies at the intersection of Machine Learning, Healthcare, and Data Analytics. The
primary objective of this system is to harness the power of advanced machine learn-
ing algorithms, specifically Gradient Boosting and Random Forest, to predict the
risk of developing chronic diseases such as Parkinson’s disease, Adiposity, and Di-
abetes. By analyzing clinical data like blood pressure, blood sugar levels, calcium
concentrations, and heart rate, the system provides users with accurate and reliable
predictions based on data-driven insights. The domain encompasses the collection
and preprocessing of medical data from trusted sources, followed by its segmentation
into training and testing datasets to train and evaluate the machine learning models.
The Symptom Analyzer feature, powered by Random Forest, allows users to input
symptoms and receive immediate feedback on potential conditions. This integra-
tion of symptom analysis with predictive modeling ensures a comprehensive health
assessment, making it an invaluable tool for early disease detection and risk manage-
ment. The system aims to offer a user-friendly interface, enabling both individuals
and healthcare professionals to make informed decisions. The ultimate goal is to
provide accessible healthcare technology that bridges gaps in early diagnosis, pro-
motes preventive care, and empowers users to take proactive steps toward improving
their overall health.
1.4 Scope of the Project
The scope of the project focuses on developing a robust ML-Based Disease Pre-
diction System that targets the prediction of Parkinson’s disease, Adiposity, and Di-
abetes using Gradient Boosting and Random Forest algorithms. The system will
process and analyze clinical data such as blood pressure, blood sugar, calcium lev-
els, and heart rate to predict the likelihood of these conditions. The scope extends to
the design of a Symptom Analyzer tool that allows users to input their symptoms and
receive possible disease outcomes based on machine learning-driven insights. This
feature aims to enhance accessibility by offering immediate, data-driven feedback.
The system’s development will involve data collection from trusted medical sources,
preprocessing for consistency, and splitting the dataset into training and testing sets
for model evaluation. The machine learning models will be optimized to improve ac-
curacy and predictive performance based on precision, recall, and F1-score metrics.
The project also covers the integration of a user-friendly interface that allows indi-
viduals to assess their health risks independently. Additionally, the scope includes
the provision of early detection capabilities, empowering users to take preventive ac-
tions to manage their health proactively. The system is intended to bridge healthcare
gaps by offering affordable, efficient, and easily accessible tools for individuals to
monitor and manage chronic disease risks in a timely manner.
Chapter 2
LITERATURE REVIEW
2.1 Literature Review
[1] Sogandi (2024) This paper focuses on automating disease prediction using lan-
guage models, particularly MCN-BERT and BiLSTM models. The study applies
these models to two datasets: one for general disease prediction and another for iden-
tifying Adverse Drug Reactions (ADRs) from Twitter data. The MCN-BERT model
optimized with AdamP achieved high accuracy for both datasets. The findings indi-
cate that deep learning models can support earlier disease detection and better remote
diagnostics. This method shows promise in improving the speed and accuracy of dis-
ease predictions. The research suggests that future improvements could lead to more
efficient remote healthcare systems.
[2] Pilehvari et al. (2024) This analytical review explores the use of AI and
machine learning in diagnosing and predicting multiple sclerosis (MS). It evaluates
how these techniques can be used for identifying risk factors, predicting disease
progression, and optimizing treatment plans. The paper emphasizes the importance
of combining clinical data with machine learning algorithms to improve diagnosis
accuracy. It also outlines the current limitations and challenges in applying AI to
MS prediction. The authors suggest that further research should focus on refining
models and making them more accessible in clinical settings. The paper concludes
that AI has the potential to revolutionize MS diagnosis and treatment strategies.
[3] Pinto et al. (2020) In this study, the authors developed machine learning mod-
els to predict the progression of multiple sclerosis. The models incorporate both
clinical and demographic data to provide predictions about disease outcomes. These
models are designed to assist in developing personalized treatment strategies for MS
patients. The research highlights the potential of machine learning to identify early
signs of disease progression and improve patient management. The authors empha-
size the need for integrating diverse data sources to enhance prediction accuracy. The
study suggests that further advancements in machine learning could lead to better in-
dividualized care for MS patients.
[4] Hassan et al. (2024) This paper focuses on optimizing disease classification
by analyzing symptoms using language models. The authors propose an approach
that leverages deep learning models to better understand and categorize diseases
based on symptom data. The results show that using advanced language models
can improve the classification accuracy of various diseases. The study demonstrates
how machine learning can analyze complex symptom patterns that may be difficult
for traditional diagnostic methods. It also highlights the challenges of working with
unstructured symptom data. The paper suggests that further research is needed to
enhance the generalization of these models for clinical applications.
[5] Gurevich et al. (2025) This research presents a machine learning-based
approach to predict disease progression in primary progressive multiple sclerosis
(PPMS). The study uses data such as blood transcriptome information and MRI met-
rics to develop prognostic models. These models aim to predict disability progres-
sion and changes in brain volume. The results indicate that machine learning can
play a crucial role in understanding PPMS progression and developing personalized
treatment plans. The authors also discuss the potential of combining molecular and
clinical data to improve prediction accuracy. They highlight that machine learning
has the potential to support better decision-making for PPMS patients and healthcare
providers.
[6] Park et al. (2021) This paper focuses on developing a machine learning model
to predict diseases based on laboratory test results. The authors compare various ma-
chine learning techniques to identify the most accurate models for diagnosing differ-
ent diseases. The study shows that machine learning models can successfully predict
diseases using only laboratory data, which simplifies the diagnostic process. The
paper highlights the ability of machine learning to process complex patterns in clin-
ical data, potentially reducing diagnostic errors. It also emphasizes the importance
of selecting the right model to ensure high prediction accuracy. The findings sug-
gest that this approach could improve efficiency in healthcare by automating disease
diagnosis.
[7] Yousef et al. (2024) This review discusses how machine learning can predict
the progression and outcomes of multiple sclerosis (MS) using MRI-based biomark-
ers. The authors examine various ML models, their challenges, and their effective-
ness in integrating MRI data to predict MS progression. They discuss the role of
biomarkers in improving prediction accuracy and the potential of machine learning
to enhance diagnostic processes. The paper reviews existing studies and suggests
that the integration of MRI data into machine learning models could lead to more
personalized treatment plans for MS patients. The authors also propose directions
for future research to refine these models. They conclude that machine learning can
significantly improve the management of MS.
[8] Delpino et al. (2022) This systematic review evaluates how machine learning
can be used to predict chronic diseases. The authors examine a variety of machine
learning techniques and assess their effectiveness in predicting conditions such as
diabetes, cardiovascular diseases, and cancer. The paper highlights the promise of
machine learning in improving early detection and prevention strategies. It discusses
the challenges of working with large datasets and the need for high-quality data to
ensure accurate predictions. The authors suggest that while machine learning offers
significant potential, there are still challenges to overcome in making these models
clinically applicable. The paper calls for further research to improve the reliability
and generalization of these models.
[9] Sang et al. (2024) This study presents a machine learning model for predicting
neurodegenerative diseases in patients with type 2 diabetes. The authors use data
from two independent Korean cohorts to develop and validate predictive models.
The study demonstrates how machine learning can identify at-risk patients early and
suggest intervention strategies. The authors emphasize the importance of validating
these models across different populations to ensure their generalizability. The paper
highlights the potential of machine learning in improving the management of type 2
diabetes and preventing associated neurodegenerative diseases. The authors propose
further studies to refine these models for clinical use.
[10] Zhang et al. (2019) This paper introduces a disease prediction and early
intervention system based on symptom similarity analysis. The authors use a con-
volutional neural network (CNN) model to analyze patient symptoms and predict
potential diseases. The system aims to identify patterns in symptoms and suggest
early intervention measures to reduce the severity of diseases. The study highlights
the accuracy of the CNN model in predicting diseases based on symptom similarity,
showing its potential for early diagnosis. The paper discusses how such systems can
improve healthcare delivery by automating the diagnostic process. The authors sug-
gest that further development of this system could lead to more efficient healthcare
solutions.
2.2 Gap Identification
[1] Sogandi, F. (2024) While the study successfully uses deep learning models for
disease prediction, it does not extensively explore the long-term impact of these mod-
els in clinical settings. Additionally, the lack of a comparative analysis between
different disease types limits its broader application. Further research is needed to
evaluate real-world implementation and scalability.
[2] Pilehvari et al. (2024) The review focuses on AI’s potential in MS diagnosis,
yet it overlooks the integration of AI with personalized treatment plans. The models
discussed have limited real-world validation, and the challenges of data privacy and
model explainability are not adequately addressed. There is a need for more practical
case studies and longitudinal data.
[3] Pinto et al. (2020) Although the study uses machine learning for disease pro-
gression prediction in MS, it does not consider the variability in disease progression
among different patient populations. The models lack personalization, and the pre-
diction accuracy can be improved by integrating more diverse data sources. A focus
on multi-center clinical trials could enhance model robustness.
[4] Hassan et al. (2024) The study optimizes disease classification but does not
account for the complexity of unstructured symptom data in real-time settings. It
assumes that symptom data is complete and well-organized, which is often not the
case in clinical practice. Future work should address the challenges of working with
real-time, noisy data.
[5] Gurevich et al. (2025) While the study uses machine learning for predicting
PPMS progression, it is limited by a narrow scope of biomarkers. The models rely
heavily on a small dataset, which may not generalize well across diverse patient
populations. Broader validation with larger and more diverse cohorts is needed to
ensure model applicability.
[6] Park et al. (2021) The model developed in this paper is limited by the scope
of laboratory tests considered. It does not incorporate non-traditional diagnostic fac-
tors such as patient history or imaging data. Further development of hybrid models
combining various diagnostic data sources could enhance prediction accuracy.
[7] Yousef et al. (2024) The review of machine learning and MRI-based biomark-
ers in MS progression prediction does not consider the real-time feasibility of these
methods in clinical practice. The potential of integrating AI with emerging tech-
nologies like wearables is not explored. Future research should focus on overcoming
practical implementation barriers.
[8] Delpino et al. (2022) This systematic review on chronic disease prediction
fails to adequately address the issue of model generalization across different health-
care settings. While machine learning shows promise, the lack of standardized data
and the ethical concerns surrounding AI in healthcare are not sufficiently discussed.
More research is needed to address these challenges.
[9] Sang et al. (2024) While the study presents a model for predicting neurode-
generative diseases in type 2 diabetes patients, it does not examine how these models
can be integrated into routine healthcare practices. There is also a need to consider
the impact of external factors, like lifestyle, on disease progression. Expanding the
model to include these factors could enhance its applicability.
[10] Zhang et al. (2019) The disease prediction system based on symptom sim-
ilarity analysis does not explore the scalability of the model across multiple disease
types. The system’s reliance on CNN limits its ability to capture complex relation-
ships in multi-symptom diseases. Future improvements could involve combining this
approach with other AI techniques for more comprehensive predictions.
Chapter 3
PROJECT DESCRIPTION
3.1 Existing System
The existing system in the domain of disease prediction primarily focuses on tradi-
tional diagnostic tools and manual health assessments, which can be time-consuming
and error-prone. Medical professionals rely on clinical tests, patient history, and ob-
servational data to diagnose chronic diseases such as Parkinson’s disease, Adiposity,
and Diabetes. While some systems offer basic health prediction models using sim-
ple statistical methods, they often lack the integration of advanced machine learning
techniques that could provide more accurate and reliable predictions. Current sys-
tems also fail to analyze symptoms dynamically or provide real-time feedback to
users. Additionally, many existing applications do not cater to early disease detec-
tion, which is crucial for reducing risks associated with chronic conditions. Most
systems are either too complex for non-medical users or lack the necessary accuracy
to be trusted for serious health assessments.
3.3 System Specification
Visual Studio Code
The application is developed in Visual Studio Code and runs as a Streamlit app, so it has no
demanding hardware or software requirements.
Standard Used: ISO/IEC 27001
Chapter 4
METHODOLOGY
4.2 General Architecture
Figure 4.1 illustrates the architecture of the system, starting with the collection of
medical data for Parkinson’s, Adiposity, and Diabetes. After preprocessing, the data
is divided into training and test sets. The Symptom Analyzer assesses symptoms,
while Gradient Boosting and Random Forest algorithms identify disease patterns for
real-time health assessment.
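The division into training and test sets mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the project's actual code; the function name split_train_test, the 80/20 ratio, the seed, and the sample records are all assumptions made for the example.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records, used only to show the call
records = [{"id": i, "glucose": 80 + 5 * i} for i in range(10)]
train, test = split_train_test(records, test_fraction=0.2)
print(len(train), len(test))
```

Shuffling before the split avoids a test set made up only of the last records collected, and the fixed seed makes an evaluation repeatable.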
4.3 Design Phase
4.3.1 Data Flow Diagram
Figure 4.2 outlines the process of collecting and preparing patient data for model
training. After the initial training phase, the model’s performance is assessed. Based
on the evaluation, adjustments are made to improve accuracy. Finally, the model
generates predictions regarding the likelihood of disease, assisting in timely diagno-
sis. The system then performs further testing to verify its robustness and reliability,
ensuring that the model can provide consistent results across various datasets. This
process ensures accuracy in real-world applications, contributing to improved patient
outcomes and streamlined health management.
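The assessment step in this flow relies on the metrics named elsewhere in the report (accuracy, precision, recall, and F1 score). The sketch below shows how they can be computed for a binary task from true and predicted labels. The function name and the sample labels are assumptions for illustration, not results from the project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard each ratio so an empty denominator yields 0.0 instead of an error
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 1 = disease present, 0 = disease absent
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting recall usually deserves particular attention, because a false negative means a missed case.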
4.3.2 Use Case Diagram
Figure 4.3 presents a comprehensive use case diagram illustrating the sequential in-
teractions between a User and a sophisticated Model within a disease prediction
framework. Initially, the User initiates the process by entering specific health-related
details into the system. This action triggers the subsequent collection of pertinent
data by the Model. Following data acquisition, the system proceeds to extract cru-
cial features relevant for analysis. A value matching operation then takes place,
potentially incorporating further input directly from the User to refine the process.
Subsequently, the extracted features undergo a classification stage. Based on this
classification, the Model generates a disease prediction. Finally, the predicted out-
come is communicated back to the User in the form of a report, thus concluding the
interaction cycle. The directional arrows clearly delineate the flow of information
and control throughout these distinct stages of the disease prediction process.
4.3.3 Class Diagram
Figure 4.4 shows the class diagram of the disease prediction system and the relationships
among its core components. A Patient object, defined by the attributes name (String),
age (int), gender (String), and a list of reported symptoms, provides medical data to the
system. This provides association results in the creation of a MedicalData object, which
handles the preprocessing steps: loading the raw data, cleaning inconsistencies and errors,
and splitting the dataset for training and evaluation. The refined MedicalData is then sent
to a central Model object. The Model forms the core of the prediction process: it trains on
the provided data, predicts potential diseases from learned patterns, and evaluates its own
predictive accuracy.
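The three classes described above can be sketched as follows. The class and attribute names follow the description of Figure 4.4. The method bodies are assumptions, and the Model here is a deliberately trivial stand-in (a threshold on the first feature), not the Gradient Boosting or Random Forest model used in the project.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Sequence[Optional[float]], int]   # (feature vector, label)

@dataclass
class Patient:
    name: str
    age: int
    gender: str
    symptoms: List[str] = field(default_factory=list)

@dataclass
class MedicalData:
    rows: List[Row] = field(default_factory=list)

    def load(self, raw_rows):
        self.rows = list(raw_rows)

    def clean(self):
        # Drop rows that contain a missing feature value
        self.rows = [(x, y) for x, y in self.rows if None not in x]

    def split(self, test_fraction=0.25):
        n_test = max(1, int(len(self.rows) * test_fraction))
        return self.rows[:-n_test], self.rows[-n_test:]

class Model:
    """Stand-in classifier: thresholds the first feature at the training mean."""

    def __init__(self):
        self.threshold = None

    def train(self, rows):
        self.threshold = sum(x[0] for x, _ in rows) / len(rows)

    def predict(self, features):
        return int(features[0] > self.threshold)

    def evaluate(self, rows):
        hits = sum(self.predict(x) == y for x, y in rows)
        return hits / len(rows)

# Hypothetical single-feature rows, used only to exercise the classes
data = MedicalData()
data.load([([90], 0), ([150], 1), ([None], 0), ([100], 0), ([160], 1)])
data.clean()
train_rows, test_rows = data.split()
model = Model()
model.train(train_rows)
accuracy = model.evaluate(test_rows)
print(len(data.rows), accuracy)
```

Keeping loading, cleaning, and splitting inside MedicalData means the Model only ever sees prepared rows, which mirrors the separation shown in the diagram.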
4.3.4 Sequence Diagram
Figure 4.5 presents the sequence diagram for the disease prediction system. The se-
quence begins with the patient submitting their symptoms and personal information
to the medical data system. This system then cleanses, preprocesses, and structures
the data before passing it on to the model for training. Once the model is trained us-
ing the data, it predicts the disease and generates a detailed report. The report is then
sent to the patient. If the patient has any doubts or needs further clarification, they can
request an explanation. In response, the report system provides additional insights
and details regarding the prediction. This sequence illustrates the efficient flow of
data from the patient’s input to the final disease prediction report, highlighting the
interactions between the patient, medical data system, model, and report system in a
clear and streamlined process.
4.3.5 Activity Diagram
Figure 4.6 illustrates the activity (user flow) diagram for the disease prediction system. The
process starts at the Start page, which leads to the Patient Form for data entry. The user then
reaches the Select Disease stage, where a choice leads to one of the specific disease paths:
Parkinson’s Disease, Diabetes Disease, or Adiposity Disease. Alternatively, the user can access
the Symptom Analyzer, which then directs towards Adiposity Disease. For Parkinson’s Disease, the
next step is Input Features, followed by Run Predictor. Finally, the Show Results page displays
the prediction outcome to the user. The arrows indicate the navigational flow within the system,
and the whole process supports Disease Risk Assessment (DRA).
4.4 Algorithm & Pseudo Code
4.4.1 Algorithm
4.4.2 Pseudo Code
– Prompt user to input relevant features (age, symptoms, medical results, etc.).
– Normalize input data using z-score.
• Step 9: Make Predictions using Trained Models
For disease prediction (Gradient Boosting):
– Use trained Gradient Boosting model to predict outcome (positive/negative
for each disease).
For symptom analysis (Random Forest):
– Use trained Random Forest model to predict symptom analysis (symptom
risk or likelihood).
• Step 10: Display Prediction Results
Show results on the Streamlit interface:
– Display "Positive/Negative" for disease prediction.
– Show risk level for Adiposity prediction (e.g., "High Risk" or "Low Risk").
– Display prediction for symptom analysis.
• Step 11: Save Results or Download
Provide an option to download or save the prediction results for future reference.
• Step 12: Periodically Update Models
Retrain models with new data when available.
Save updated models using joblib.
End
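The z-score normalization step in the pseudo code can be sketched as below. The key point is that the mean and standard deviation are computed once from the training data and then reused for every user input. The function names and the sample values are assumptions for illustration.

```python
from statistics import mean, pstdev

def fit_zscore(training_rows):
    """Compute the per-feature mean and standard deviation from training rows."""
    columns = list(zip(*training_rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 so a constant feature does not cause division by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def apply_zscore(row, means, stds):
    """Normalize one input row with the statistics learned from training data."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

# Hypothetical two-feature training rows and one user input
training_rows = [[50, 120.0], [60, 140.0], [70, 160.0]]
means, stds = fit_zscore(training_rows)
user_input = [60, 140.0]
normalized = apply_zscore(user_input, means, stds)
print(normalized)
```

If the trained models are saved with joblib as in Step 12, the fitted means and standard deviations would need to be saved alongside them so that new inputs are scaled the same way as the training data.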
4.4.3 Data Set
The dataset used in this project is comprised of medical records and diagnostic in-
formation for various diseases, including Parkinson’s, diabetes, and adiposity. Ad-
ditionally, a symptom analyzer dataset is included for evaluating symptoms and pre-
dicting health risks.
Parkinson’s Disease Dataset: This dataset includes features such as voice mea-
surements from patients, including various acoustic features like jitter, shimmer, and
frequency measures. The dataset is used to predict whether a person has Parkinson’s
disease based on these features.
Diabetes Dataset: This dataset contains information on different medical indi-
cators like glucose level, blood pressure, BMI, age, and insulin, which are used to
predict whether a person has diabetes. The data includes both positive and negative
labels for diabetes outcomes.
Adiposity Dataset: This dataset includes features such as height, weight, waist
circumference, hip size, and age and is used to predict the risk of adiposity, which
refers to excess body fat that may lead to health issues.
Symptom Analyzer Dataset: This dataset is used to evaluate a set of symptoms
and predict the likelihood of various diseases based on the provided symptoms. It
includes various medical symptoms along with their associations to different health
conditions.
4.5.1 Module 1: Diabetes Disease Prediction
This module focuses on predicting diabetes from a labeled dataset with features
such as age, BMI, blood pressure, skin thickness, insulin level, glucose level, number
of pregnancies, and diabetes pedigree function. First, the input features are scaled
to the range 0 to 1, and the dataset is divided into training and validation sets. A
Gradient Boosting model is then trained on the training data: it adds decision trees one
at a time, each fitted to reduce the loss left by the trees before it. The model’s
performance is assessed using accuracy, precision, recall, and F1 score on the validation
data. Once the model is optimized, it is tested on unseen data to validate its ability to
predict diabetes reliably.
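Scaling features to the range 0 to 1, as described for this module, is min-max scaling. The sketch below is an illustration only. The two fitting rows are the diabetes test inputs listed in Chapter 5, the third row is hypothetical, and the assumed feature order (pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, pedigree function, age) is not confirmed by the report.

```python
def fit_min_max(rows):
    """Record the minimum and maximum of every feature column."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def scale_min_max(row, mins, maxs):
    """Map each feature to [0, 1] using the fitted minimum and maximum."""
    scaled = []
    for value, lo, hi in zip(row, mins, maxs):
        if hi == lo:
            scaled.append(0.0)   # constant feature: no spread to scale by
        else:
            # Clip so values outside the fitted range stay within [0, 1]
            scaled.append(min(1.0, max(0.0, (value - lo) / (hi - lo))))
    return scaled

fit_rows = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31],
]
mins, maxs = fit_min_max(fit_rows)

new_row = [3, 120, 70, 30, 0, 30.0, 0.5, 40]   # hypothetical input
scaled_row = scale_min_max(new_row, mins, maxs)
print([round(v, 3) for v in scaled_row])
```

As with any scaler, the minimum and maximum should be fitted on the training set only and then applied unchanged to validation and unseen data.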
training, the model is tested on unseen data to ensure reliable predictions for Parkin-
son’s disease detection.
4.5.3 Module 3: Adiposity Risk Prediction
In this module, the objective is to predict the risk of adiposity based on factors like
height, weight, waist, hip measurements, and age. The dataset is preprocessed by
scaling the input features and splitting it into training and testing sets. A gradient
boosting model is trained on this data, and the model’s performance is evaluated
using accuracy, precision, recall, and F1 score. Testing is done on unseen data to
ensure the model provides accurate predictions for adiposity risk and can be used to
support health risk assessments.
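To make the idea of gradient boosting concrete, the sketch below builds a tiny booster from scratch with one-split trees (stumps). It is a simplified illustration under stated assumptions: it uses squared-error loss, whereas library classifiers typically use a log-loss objective, and the features (weight in kg, waist in cm), the data, and the 0.5 cut-off are hypothetical rather than taken from the project.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            error = (sum((r - lv) ** 2 for r in left)
                     + sum((r - rv) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, lv, rv)
    if best is None:
        raise ValueError("no valid split: all feature values are identical")
    return best[1:]   # (feature index, threshold, left value, right value)

def stump_predict(stump, row):
    j, threshold, lv, rv = stump
    return lv if row[j] <= threshold else rv

def fit_gradient_boosting(X, y, n_rounds=20, learning_rate=0.3):
    base = sum(y) / len(y)              # start from the mean label
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, row)
                       for p, row in zip(predictions, X)]
    return base, learning_rate, stumps

def predict_score(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)

# Hypothetical rows: [weight_kg, waist_cm]; label 1 = high adiposity risk
X = [[55, 70], [60, 75], [62, 78], [85, 100], [90, 105], [95, 110]]
y = [0, 0, 0, 1, 1, 1]
gb_model = fit_gradient_boosting(X, y)

low_score = predict_score(gb_model, [58, 72])
high_score = predict_score(gb_model, [92, 108])
print("High Risk" if high_score >= 0.5 else "Low Risk")
```

Each round fits a new stump to what the current ensemble still gets wrong, and the learning rate shrinks every correction so that no single stump dominates.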
4.5.4 Module 4: Symptom Analyzer (Random Forest)
In this module, Random Forest is utilized to analyze symptoms and predict potential
diseases based on those symptoms. The dataset includes various features represent-
ing symptoms, and it is preprocessed by dividing the data into training and testing
sets. The random forest algorithm is employed to train a model using decision trees
to analyze the significance of each symptom. The model is then evaluated using
accuracy, precision, and recall to measure its performance. To ensure the model’s
generalization, it is tested on unseen data, making it a reliable tool for symptom-
based disease prediction.
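One common way to turn a list of reported symptoms into model features is a multi-hot vector, with one binary feature per known symptom. The sketch below assumes this encoding; the report does not state how its symptom features are built, and the vocabulary shown is hypothetical.

```python
# Hypothetical symptom vocabulary; the real dataset's columns may differ.
SYMPTOM_VOCABULARY = ["fever", "cough", "fatigue", "headache", "nausea"]

def encode_symptoms(reported, vocabulary=SYMPTOM_VOCABULARY):
    """Return (multi-hot vector, list of symptoms not in the vocabulary)."""
    index = {name: i for i, name in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    unknown = []
    for symptom in reported:
        key = symptom.strip().lower()
        if key in index:
            vector[index[key]] = 1
        else:
            unknown.append(symptom)
    return vector, unknown

vector, unknown = encode_symptoms(["Cough", " fever ", "rash"])
print(vector, unknown)
```

Returning the unrecognized symptoms separately lets the interface tell the user which entries were ignored instead of silently dropping them.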
Chapter 5
The input design focuses on collecting and processing data from the user or other
systems. For the given project, the inputs are collected for the different prediction
models, such as the Parkinson's Disease, Adiposity, and Diabetes models. These inputs
consist of numerical values for health indicators such as age, weight, and related
parameters.
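Because every model expects a fixed-length numeric vector, the inputs can be checked before they reach a model. The following is a sketch of such a check for the eight diabetes features; the function name, the feature-name list, and the rule that values must be non-negative are assumptions made for illustration.

```python
import math

# Feature order assumed to match the Diabetes Prediction page fields.
DIABETES_FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]

def validate_features(values, feature_names):
    """Raise ValueError if the vector has the wrong length or invalid entries."""
    if len(values) != len(feature_names):
        raise ValueError(
            f"expected {len(feature_names)} features, got {len(values)}")
    for name, value in zip(feature_names, values):
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not math.isfinite(value):
            raise ValueError(f"{name} must be a finite number, got {value!r}")
        if value < 0:
            raise ValueError(f"{name} must not be negative, got {value!r}")
    return [float(v) for v in values]

# First sample row from the report's diabetes test inputs.
clean = validate_features([6, 148, 72, 35, 0, 33.6, 0.627, 50], DIABETES_FEATURES)
print(clean)
```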
The output design determines how the results will be presented to the user. For
this project, the output consists of a prediction based on the given input data, such
as "Likely Parkinson's Disease Detected," "Healthy," "High Risk of Adiposity," or
"Likely Diabetic." The output is displayed in a simple, user-friendly format, ensuring
that the users can easily interpret the results.
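The mapping from a model's numeric prediction to the message shown to the user can be sketched as a lookup table. The message strings are taken from this chapter; the dictionary keys, the function name, and the convention that 1 means a positive prediction are assumptions.

```python
# Assumed convention: 1 = condition predicted, 0 = not predicted.
OUTPUT_MESSAGES = {
    "parkinsons": {1: "Likely Parkinson's Disease Detected", 0: "Healthy"},
    "adiposity": {1: "High Risk of Adiposity", 0: "Low Risk of Adiposity"},
    "diabetes": {1: "Likely Diabetic", 0: "Non-Diabetic"},
}

def format_prediction(model_name, prediction):
    """Translate a 0/1 prediction into the user-facing message."""
    try:
        return OUTPUT_MESSAGES[model_name][int(prediction)]
    except KeyError:
        raise ValueError(
            f"no message for model {model_name!r} and prediction {prediction!r}")

print(format_prediction("diabetes", 1))
```

Keeping the wording in one table means the interface text can be changed without touching the prediction code.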
Input
# Diabetes Prediction Inputs (8 features each)
diabetes_test_inputs = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31]
]
Test result
Input
test_cases = [
    {"test_case": 1, "input": [150.5, 170.2], "expected_output": "Likely Parkinson's Disease Detected"},
    {"test_case": 2, "input": [180.1, 200.0], "expected_output": "Healthy (No Parkinson's Disease)"},
    {"test_case": 3, "input": [58, 1, 26.5], "expected_output": "High Risk of Adiposity"},
    {"test_case": 4, "input": [65, 0, 30.2], "expected_output": "Low Risk of Adiposity"},
    {"test_case": 5, "input": [5, 140], "expected_output": "Likely Diabetic"},
    {"test_case": 6, "input": [2, 90], "expected_output": "Non-Diabetic"}
]

# Placeholder threshold rules that stand in for the trained models.
# Note: under these rules, test cases 2, 3, and 4 report Fail.
def parkinsons_model(input_data):
    return "Likely Parkinson's Disease Detected" if input_data[0] > 150 else "Healthy (No Parkinson's Disease)"

def adiposity_model(input_data):
    return "High Risk of Adiposity" if input_data[2] > 30 else "Low Risk of Adiposity"

def diabetes_model(input_data):
    return "Likely Diabetic" if input_data[0] > 4 else "Non-Diabetic"

def run_tests(test_cases):
    # Route each case to its model by test case number and compare outputs
    for case in test_cases:
        input_data = case["input"]
        if case["test_case"] in [1, 2]:
            case["system_output"] = parkinsons_model(input_data)
        elif case["test_case"] in [3, 4]:
            case["system_output"] = adiposity_model(input_data)
        else:
            case["system_output"] = diabetes_model(input_data)
        result = "Pass" if case["expected_output"] == case["system_output"] else "Fail"
        print(f"Test Case {case['test_case']} - {result}")

run_tests(test_cases)
Test Result
Chapter 6
for each diagnosis, leading to a fragmented and cumbersome experience. Each
disease-specific application processes data and generates predictions independently,
without integrating insights from other conditions. The separate applications lead
to inefficiencies in data management and diagnosis, as healthcare professionals
must use multiple tools for a comprehensive assessment of a patient’s health.
Furthermore, the accuracy of the predictions may vary across applications, as they
are not unified, resulting in inconsistencies and less reliable outcomes.
Output
Figure 6.2: Output 3
This figure shows the Adiposity Prediction page of an ML-based disease prediction
platform. Input fields for age, gender, weight, height, lifestyle factors, and family
history are visible, along with a Check Adiposity button and an Adiposity Detected
indicator.
Figure 6.3: Output 2
This figure displays the Parkinson's Disease Prediction page of an ML-based platform.
It features numerous input fields for voice characteristics (MDVP, Shimmer, Jitter,
HNR, RPDE, DFA, Spread, PPE) and a Check Parkinson's button, with a Parkinson's
Detected indicator at the bottom.
Figure 6.4: Output 4
The figure displays the Diabetes Prediction interface of an ML-driven health platform.
It presents input fields for key health metrics such as Pregnancies, Glucose Level,
and Blood Pressure alongside Skin Thickness, Insulin, BMI, Diabetes Pedigree Function,
and Age. A noticeable Check Diabetes button allows users to trigger the prediction
process. The outcome is clearly indicated below as No Diabetes Detected.
Chapter 7
7.1 Conclusion
The integrated disease prediction model presented in this project offers a streamlined
approach to screening for multiple conditions, namely diabetes, Parkinson's disease,
and adiposity risk, alongside a symptom analyzer. It uses Random Forest for symptom
analysis and Gradient Boosting for disease prediction, and each model is evaluated on
unseen data. Integrating these models into a single platform eliminates the need for
multiple separate applications, making the prediction process more accessible and
user-friendly. Healthcare professionals can assess a patient's health using one system
to screen for a range of conditions, which supports more effective decision-making.
In addition to simplifying the diagnostic process, the proposed system emphasizes
early disease detection, which is crucial for improving patient outcomes. Accurate
predictions give healthcare providers the information they need to intervene at the
earliest stages of disease development, potentially reducing the severity of diseases
and the burden on healthcare systems. By offering a holistic health assessment, the
integrated system supports preventive healthcare. The system is also scalable: new
disease models can be added over time, making it a valuable long-term tool in
healthcare management and patient care.
Looking ahead, there are several opportunities to expand the scope and functionality
of the integrated disease prediction system. One of the most significant enhancements
will involve adding more diseases to the platform. This will not only increase the
system's utility but also make it applicable to a broader range of health conditions.
By incorporating additional disease models, such as respiratory disorders,
cardiovascular diseases, or even neurological conditions, the system will become a
comprehensive tool for healthcare professionals. The integration of more diseases
will ensure that the platform can be used in a wider variety of clinical settings,
providing a holistic health assessment for patients across different demographics and
risk factors.
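One way to keep the platform open to additional diseases is a registry that maps each page name to its prediction function, so a new module is added by registering one more function. The sketch below is an assumption about how this could be organized, not the project's implementation; the `register` decorator, the `PREDICTORS` table, and the respiratory module are hypothetical, and the diabetes rule is the placeholder threshold from the Chapter 5 test stub rather than a trained model.

```python
from typing import Callable, Dict, Sequence

# Page name -> prediction function (hypothetical registry).
PREDICTORS: Dict[str, Callable[[Sequence[float]], str]] = {}

def register(page_name):
    """Decorator that adds a prediction function to the registry."""
    def decorator(func):
        PREDICTORS[page_name] = func
        return func
    return decorator

@register("Diabetes Prediction")
def predict_diabetes(features):
    # Placeholder rule taken from the report's test stub, not a trained model.
    return "Likely Diabetic" if features[0] > 4 else "Non-Diabetic"

@register("Respiratory Disorder Prediction")  # hypothetical future module
def predict_respiratory(features):
    raise NotImplementedError("model not trained yet")

def predict(page_name, features):
    """Dispatch to the registered predictor for the selected page."""
    if page_name not in PREDICTORS:
        raise KeyError(f"no predictor registered for {page_name!r}")
    return PREDICTORS[page_name](features)

print(predict("Diabetes Prediction", [5, 140]))
```

With this layout the navigation buttons could be generated from the registry keys, so the interface grows automatically as models are added.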
Another exciting enhancement is the incorporation of X-ray image recognition into
the system. By integrating image analysis capabilities, the system will be able to
diagnose conditions based on visual inputs, such as X-rays or CT scans, further
expanding its diagnostic capabilities. This would involve training deep learning models
on large medical image datasets to recognize patterns indicative of specific diseases.
The ability to combine both structured data (such as lab results and patient history)
with unstructured data (like X-ray images) will significantly improve the accuracy
and speed of diagnosis. As a result, the system will move closer to becoming an
all-encompassing diagnostic assistant, capable of helping healthcare professionals
diagnose a wider range of conditions with greater precision.
Chapter 8
PLAGIARISM REPORT
Chapter 9
Source Code
import streamlit as st
import numpy as np
import joblib
import random

# Load models (absolute paths are specific to the development machine)
adiposity_model = joblib.load(r"C:\Users\sudeep\Desktop\m2\adiposity_random_forest_model.pkl")
parkinsons_model = joblib.load(r"C:\Users\sudeep\Desktop\m2\parkinsons_model.pkl")
diabetes_model = joblib.load(r"C:\Users\sudeep\Desktop\m2\diabetes_model.pkl")

# Dummy models (just for demonstration): page name -> placeholder identifier
models = {
    "Adiposity Prediction": "adiposity_model",
    "Parkinson's Prediction": "parkinsons_model",
    "Diabetes Prediction": "diabetes_model",
    "Symptom Checker": "symptom_checker"
}

# Custom styling for the navigation buttons
st.markdown("""
<style>
.stButton>button {
    background-color: #ff7200;
    color: white;
    border: none;
    border-radius: 12px;
    padding: 15px 30px;
    font-size: 14px;
    font-weight: bold;
    text-transform: uppercase;
    transition: background-color 0.3s ease;
    width: 220px;
    height: 60px;
}
.stButton>button:hover {
    background-color: white;
    color: #ff7200;
    border: 2px solid #ff7200;
    transform: scale(1.05);
}
</style>
""", unsafe_allow_html=True)

st.title("ML-BASED DISEASE PREDICTION AND SYMPTOM ANALYZER")
col1, col2 = st.columns(2)
# Navigation: remember the chosen page across Streamlit reruns
with col1:
    if st.button("Adiposity Prediction"):
        st.session_state.page = "Adiposity Prediction"
    if st.button("Diabetes Prediction"):
        st.session_state.page = "Diabetes Prediction"
with col2:
    if st.button("Parkinson's Prediction"):
        st.session_state.page = "Parkinson's Prediction"

# The listing does not show where `selected` is defined; reading it back from
# session state is an assumption based on the assignments above.
selected = st.session_state.get("page")

if selected == "Adiposity Prediction":
    st.write("Adiposity Prediction page loaded.")
elif selected == "Parkinson's Prediction":
    st.write("Parkinson's Prediction page loaded.")
elif selected == "Diabetes Prediction":
    st.write("Diabetes Prediction page loaded.")
elif selected == "Symptom Checker":
    st.write("Symptom Checker page loaded.")
# Adiposity Prediction Page
if selected == "Adiposity Prediction":
    st.header("Adiposity Prediction")
    col1, col2, col3, col4 = st.columns(4)
    with col1:
        age = st.number_input("Age", 0, 100)
        weight = st.number_input("Weight (kg)", 20, 200)
        CH2O = st.number_input("CH2O (Water Intake)", 0.1, 10.0)
    with col2:
        gender = st.selectbox("Gender", ["Male", "Female"])
        height_unit = st.selectbox("Height Unit", ["Centimeters", "Feet"])
        if height_unit == "Centimeters":
            height = st.number_input("Height (cm)", 50, 300)
        else:
            # Convert feet to centimeters (1 ft = 30.48 cm)
            height = st.number_input("Height (feet)", 1.5, 10.0) * 30.48
    with col3:
        FCVC = st.selectbox("FCVC (Veg Consumption)", ["Yes", "No"])
        NCP = st.number_input("NCP (Main Meals/Day)", 1, 5)
        family_history = st.selectbox("Family History", ["Yes", "No"])
    with col4:
        FAVC = st.selectbox("FAVC (Frequent Veg)", ["Yes", "No"])
        CALC = st.number_input("CALC (Calories)", 0, 10000)
        TUE = st.number_input("TUE (Energy Use)", 0, 10000)
    col5, col6, col7, col8 = st.columns(4)
    with col5:
        MTRANS = st.selectbox("MTRANS (Transport)", ["Walking", "Car", "Bicycle", "Motorbike"])
        FAF = st.number_input("FAF (Activity/Week)", 0, 7)
    with col6:
        CAEC = st.selectbox("CAEC (Alcohol)", ["Yes", "No"])
    with col7:
        SMOKE = st.selectbox("SMOKE", ["Yes", "No"])
    with col8:
        SCC = st.selectbox("SCC (Physical Activity)", ["Yes", "No"])

    if st.button("Check Adiposity"):
        if all([age > 0, height > 0, weight > 0, NCP > 0, CH2O > 0, TUE >= 0, CALC >= 0, FAF >= 0]):
            # Rule-based shortcuts applied before the model is consulted
            if weight >= 120:
                st.success("Adiposity Detected")
            elif weight > 100 and MTRANS in ["Car", "Motorbike"]:
                st.success("Adiposity Detected")
            else:
                # Encode categorical answers as integers for the model
                transport_codes = {"Walking": 1, "Car": 2, "Bicycle": 3, "Motorbike": 4}
                input_data = np.array([[age, height, weight, 1 if gender == 'Male' else 0,
                                        1 if family_history == 'Yes' else 0,
                                        1 if FAVC == 'Yes' else 0, 1 if FCVC == 'Yes' else 0, NCP,
                                        1 if CAEC == 'Yes' else 0, 1 if SMOKE == 'Yes' else 0, CH2O,
                                        1 if SCC == 'Yes' else 0, FAF, TUE, CALC,
                                        transport_codes[MTRANS]]])
                prediction = adiposity_model.predict(input_data)[0]
                st.success("Adiposity Detected" if prediction == 1 else "No Adiposity Detected")
        else:
            st.warning("Please fill in all required fields.")
elif selected == "Parkinson's Prediction":
    st.header("Parkinson's Disease Prediction")
    col1, col2, col3, col4 = st.columns(4)
    # Each field is pre-filled with a random default value
    with col1:
        fo = st.number_input("MDVP:Fo(Hz)", value=random.uniform(100, 200))
        jitter_percent = st.number_input("MDVP:Jitter(%)", value=random.uniform(0, 1))
        rap = st.number_input("MDVP:Rap", value=random.uniform(0, 1))
    with col2:
        ppq = st.number_input("MDVP:PPQ", value=random.uniform(0, 1))
        jitter_abs = st.number_input("MDVP:Jitter(Abs)", value=random.uniform(0, 0.1))
        shimmer = st.number_input("MDVP:Shimmer", value=random.uniform(0, 0.1))
    with col3:
        shimmer_dB = st.number_input("MDVP:Shimmer(dB)", value=random.uniform(0, 1))
        apq = st.number_input("MDVP:Apq", value=random.uniform(0, 1))
    with col4:
        hnr = st.number_input("MDVP:HNR", value=random.uniform(10, 25))
        rpde = st.number_input("RPDE", value=random.uniform(0, 1))
    if st.button("Check Parkinson's"):
        # A value of 0 is treated as a missing field
        if all([fo, jitter_percent, rap, ppq, jitter_abs, shimmer, shimmer_dB, apq, hnr, rpde]):
            input_data = np.array([[fo, jitter_percent, rap, ppq, jitter_abs, shimmer, shimmer_dB,
                                    apq, hnr, rpde]])
            prediction = parkinsons_model.predict(input_data)[0]
            st.success("Parkinson's Disease Detected" if prediction == 1
                       else "No Parkinson's Disease Detected")
        else:
            st.warning("Please fill in all required fields.")
# Symptom Checker
elif selected == "Symptom Checker":
    st.write("Symptom Checker page loaded.")