International Journal of Fuzzy Logic Systems (IJFLS) Vol.7, No.3, October 2017
DOI: 10.5121/ijfls.2017.7302
GENETIC ALGORITHM (GA) OPTIMIZATION USING DIABETES EXPERIMENTAL DATA
Ejiofor C. I.¹ & Laud Charles Ochei²
¹ Department of Computer Science, University of Port-Harcourt, Port-Harcourt, Nigeria
² Robert Gordon University, Aberdeen, United Kingdom
Abstract
Diabetes datasets contain noisy features, which hamper classification and prediction in Artificial Intelligence (AI) systems. This research focuses on optimizing a diabetes dataset with a Genetic Algorithm (GA) while exploring the fundamentals of the technique. The dataset, obtained from Biostat, comprises fifteen (15) random samples and five parameter variables: cholesterol, high-density lipoprotein, age, height and weight. The simulation was carried out in Matrix Laboratory (MATLAB). The optimized dataset was validated using a standard optimization equation, resulting in a percentage score of 41%. This dataset will be used in a fuzzy classification system.
Keywords
Genetic Algorithm, Diabetes.
1. INTRODUCTION
Diabetes is a condition in which the body fails to utilize ingested glucose properly. It is caused by a deficiency in insulin production or by the failure of the body to respond to insulin (Ananya, 2017). Excess glucose in the bloodstream over time can result in eye, kidney and nerve damage. It can also lead to heart disease, stroke and even amputation.
Diabetes has been identified as the fastest-growing long-term disease, affecting millions of people worldwide (MedlinePlus, 2017). The statistics have been alarming across the globe. In 2013 it was estimated that over 382 million people worldwide were suffering from diabetes. In the United Kingdom, 2 million persons have been identified as suffering from diabetes, with 750,000 unaware of their illness. In the United States, 25.8 million people, or 8.3% of the population, have been identified as diabetes sufferers, with 7.0 million remaining undiagnosed. In 2010, about 1.9 million new cases of diabetes were also identified in the United States, with a prediction that 1 in every 3 Americans could have diabetes by 2050 (Ananya, 2017).
Diabetes can be categorized into Type I and Type II. Type I diabetes, also known as insulin-dependent diabetes mellitus, is usually associated with adolescence. This form of diabetes is due to the inability of the body to produce insulin as a result of an autoimmune disorder that damages the pancreas and eliminates insulin production. Approximately 10% of diabetes diagnoses are associated with Type I. Type II diabetes is also known as adult-onset or non-insulin-dependent diabetes. This form of diabetes arises from the failure of the pancreas to produce enough insulin to metabolize glucose, which is usually associated with aging. Approximately 90% of all cases of diabetes worldwide are associated with Type II (MedlinePlus, 2017).
Diabetes symptoms include frequent urination, intense thirst and hunger, weight gain, unusual weight loss, fatigue, male sexual dysfunction, and numbness and tingling in the hands and feet (MedlinePlus, 2017; Ananya, 2017).
Treatment and diagnosis of diabetes are usually complex, with physicians relying on the patient's symptoms together with age, weight, height, family history and contributing factors such as alcoholism and lack of physical exercise to identify Type I or Type II diabetes. While diagnoses have been consistent over time, the difficulties experienced by novice physicians and the non-availability of experienced experts have fostered numerous Artificial Intelligence (AI) models. Although these models have been employed for the prediction of diabetes and built to complement the conventional approach of physician-patient interaction (Mehdi et al., 2012; Meysam and Mahdi, 2016), they may have suffered from inaccuracies due to noisy training samples, which hamper training.
This research paper explores the genetic algorithm optimization technique for preprocessing a diabetes dataset and identifying the variation it introduces within the dataset. The variation is quantified using a standard optimization equation.
2. GENETIC ALGORITHM (GA) OVERVIEW
The Genetic Algorithm (GA), a search and optimization technique based on an adaptation of the natural selection process and belonging to the family of Evolutionary Algorithms (EA), has found its mark in Artificial Intelligence problem domains. GA provides a framework for solving both constrained and unconstrained optimization problems based on a natural selection process that mimics biological evolution (Akbari, 2010). GA is used to arrive at optimal solutions, directing the optimization toward the best possible region of the search space (Eiben, 1994) by modifying a population of individual solutions.
GA usually begins the optimization process with an initial population. This population consists of candidate solutions, each representing one possible solution to the problem. In each generation, the fitness of every individual is evaluated using the objective function, which in most cases is created based on adaptation, user examination, experimental design and trial and error (Son et al., 2016). The fittest individuals are stochastically selected from the current population, and each genome is modified by genetic operators. This modification creates a new generation, which is evaluated in turn. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population (Harik et al., 2006).
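The generational loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MATLAB configuration used in this paper: the function name `run_ga`, the binary tournament selection, the uniform crossover, the mutation rate and the sum-of-genes objective in the usage lines are all hypothetical choices made only to show the evaluate, select, modify and terminate steps.

```python
import random


def run_ga(fitness, bounds, pop_size=15, generations=15, mutation_rate=0.1, seed=0):
    """Minimise `fitness` over real-valued genomes; returns (best_genome, best_score)."""
    rng = random.Random(seed)
    # Initial population: random genomes drawn uniformly inside the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(population, key=fitness)

    def pick(current):
        # Binary tournament (assumed operator): the fitter of two random individuals wins.
        a, b = rng.sample(current, 2)
        return a if fitness(a) <= fitness(b) else b

    # Termination criterion used here: a fixed maximum number of generations.
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(population), pick(population)
            # Uniform crossover: each gene is copied from one of the two parents.
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            # Mutation: occasionally resample a gene inside its bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate else g
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
        # Keep track of the best individual seen in any generation.
        best = min(population + [best], key=fitness)
    return best, fitness(best)


# Usage: five decision variables, each assumed to lie in [0, 100]; objective = sum of genes.
bounds = [(0.0, 100.0)] * 5
best, score = run_ga(sum, bounds)
print(best, score)
```

A satisfactory-fitness stopping rule, the second termination criterion mentioned above, could be added by breaking out of the loop once `fitness(best)` falls below a chosen threshold.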
GA population size depends largely on the problem domain. The population usually combines numerous possible solutions, generated randomly, which together form the search or state space. These solutions are progressively steered toward better solutions (Taherdangkoo et al., 2012). Solutions from the preceding generation are combined to breed each successive generation. Individual solutions are selected through a fitness-based process, where fitter solutions are typically more likely to be selected. Novel solutions are produced by using the parent solutions in the breeding pool to generate child solutions (Coffin and Robert, 2008). The optimization process usually terminates when a minimum criterion is satisfied, the allocated budget is exhausted or the highest fitness value is reached (Echegoyen et al., 2012).
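The fitness-based selection step can be illustrated with fitness-proportionate (roulette-wheel) selection, sketched below in Python. The helper names `selection_probabilities` and `roulette_select` are hypothetical, and the sketch assumes non-negative fitness values where a higher value means a fitter individual; a minimization problem would first need its scores transformed (for example, inverted) before being used as weights.

```python
import random


def selection_probabilities(fitness_values):
    """Probability of each individual being chosen, proportional to its fitness."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]


def roulette_select(individuals, fitness_values, k, rng):
    """Draw k parents with replacement, weighted by fitness."""
    return rng.choices(individuals, weights=fitness_values, k=k)


# Usage with three hypothetical individuals.
rng = random.Random(0)
individuals = ["a", "b", "c"]
fitness_values = [1.0, 3.0, 4.0]
print(selection_probabilities(fitness_values))
print(roulette_select(individuals, fitness_values, k=4, rng=rng))
```

Fitter individuals are more likely to enter the breeding pool, but weaker ones keep a non-zero chance, which preserves diversity in the population.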
Solution encoding, objective function identification, solution identification, application of operators and termination are the fundamental components of GA (Coffin and Robert, 2008; Akbari, 2010; Echegoyen et al., 2012).
3. METHODOLOGY: DIABETES DATASET OPTIMIZATION USING MATRIX LABORATORY (MATLAB)
The dataset for the simulation was obtained from Biostat and served as the experimental data for optimization. The dataset comprises fifteen samples spread across five decision variables: cholesterol, high-density lipoprotein, age, height and weight. Table 3.1 depicts the diabetes simulation data.
Table 3.1: Un-Optimized Diabetes Dataset
3.1 MATLAB GA SIMULATIONS
Figure 3.1: GA-Simulation Chart
The chart in Figure 3.1 captures the fundamentals of the diabetes GA simulation. It shows that five input variables were used together with fifteen samples. The population type for the simulation was set to double, and the creation function was uniform. The chart also depicts the weighted score for each column parameter input. The simulation chart further captures the best fitness value, best individual value, genealogy, score and selection.
Figure 3.2: Generation 1 (initial) GA-Simulation chart
Figure 3.2 shows the first-generation simulation and its fitness values: the best fitness value at generation one is 343, and the overall mean of the 15 samples is 506. The generation is identified as initial. The selection function pictures the probability available to each sample of being selected to produce children for mating in the next generation, with the fifth individual having the highest selection probability, 6. The current best individual plot gives the best individual score per generation, while the fitness score of each individual can be identified successively for each generation. For this generation an average fitness score below 600 was maintained.
Figure 3.3: Generation 15 (final) GA-Simulation chart
Figure 3.3 shows the final-generation simulation and its fitness values: the best fitness value at generation fifteen is 307, and the overall mean of the 15 samples is 301. The generation is identified as fifteen. The selection function pictures the probability available to each sample of being selected to produce children for mating in the next generation, with the fifth individual having the highest selection probability, 7. The current best individual plot gives the best individual score per generation, while the fitness score of each individual can be identified successively for each generation. For this generation an average fitness score above 300 was maintained.
Table 3.2: Optimized Diabetes Dataset
Table 3.2 presents the values obtained after optimization. The values cut across cholesterol, high-density lipoprotein, age, height and weight, and show how the data changed compared with the previous values in Table 3.1.
4. VALIDATION OF OPTIMIZED DIABETES DATASET
Validation quantifies the variation introduced by the optimization and determines the percentage optimization. It measures how much the fundamental components have been optimized. Equation 4.1 determines this change:
\[ \mathrm{DoO} = \frac{\lvert R_0 - M_0 \rvert}{M_0} \times 100\% \qquad (4.1) \]
where
R_0 = summation of fitness values of the optimized dataset
M_0 = summation of fitness values of the non-optimized dataset
Table 4.2 captures the non-optimized and the optimized datasets; the datasets are exemplified from case 1-10 and 110.
Case     Cholesterol  High-Density Lipoprotein  Age  Height  Weight  Weighted Score  Status   Confusion Matrix
Case 1   78           12                        67   67      119     343             Type II  TP
Case 2   78           12                        67   67      119     343             Type II  FP
Case 3   78           36                        27   67      119     327             Type II  TP
Case 4   78           12                        40   59      121     310             Type II  TP
Case 5   78           12                        36   67      119     312             Type II  TP
Case 6   79           13                        27   62      119     300             Type II  TN
Case 7   78           12                        27   66      119     302             Type II  TN
Case 8   78           12                        67   66      119     342             Type II  TN
Case 9   78           12                        45   62      119     316             Type II  TP
Case 10  78           12                        37   69      119     315             Type II  TP
Case 11  78           12                        27   67      119     303             Type II  TP
Case 12  78           12                        29   64      119     302             Type II  TN
Case 13  78           12                        34   67      119     310             Type II  TN
Case 14  78           12                        36   59      119     303             Type II  TP
Case 15  78           12                        29   63      119     300             Type II  TN
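The weighted scores in the table above can be cross-checked with a short Python sketch. It assumes that the weighted score is the unweighted sum of the five input variables; the table suggests this rule, but the paper does not state it, so the helper `row_sum` is a hypothetical scoring function used only for the check.

```python
# Rows copied from the table above:
# (cholesterol, HDL, age, height, weight, reported weighted score).
rows = [
    (78, 12, 67, 67, 119, 343),
    (78, 12, 67, 67, 119, 343),
    (78, 36, 27, 67, 119, 327),
    (78, 12, 40, 59, 121, 310),
    (78, 12, 36, 67, 119, 312),
    (79, 13, 27, 62, 119, 300),
    (78, 12, 27, 66, 119, 302),
    (78, 12, 67, 66, 119, 342),
    (78, 12, 45, 62, 119, 316),
    (78, 12, 37, 69, 119, 315),
    (78, 12, 27, 67, 119, 303),
    (78, 12, 29, 64, 119, 302),
    (78, 12, 34, 67, 119, 310),
    (78, 12, 36, 59, 119, 303),
    (78, 12, 29, 63, 119, 300),
]


def row_sum(row):
    """Unweighted sum of the five input variables (assumed scoring rule)."""
    return sum(row[:5])


# Case numbers (1-based) whose reported score differs from the row sum.
mismatches = [i + 1 for i, row in enumerate(rows) if row_sum(row) != row[5]]
reported_total = sum(row[5] for row in rows)

print("cases where the row sum differs from the reported score:", mismatches)
print("total of reported scores:", reported_total)
```

With the values as extracted here, the reported scores total 4728, the optimized figure used in the degree-of-optimization calculation. Cases 14 and 15 are each reported one unit below their row sum; the paper does not say whether this reflects rounding, a weighting or a transcription slip.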
Table 4.3: Non-Optimized and Optimized Fitness Values
Degree of Optimization (DoO) = |4728 - 8017| / 8017
= 3289 / 8017
≈ 0.4103
≈ 0.4103 × 100
≈ 41%
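The arithmetic above can be reproduced with a few lines of Python. The function name `degree_of_optimization` is an illustrative assumption; the formula is the relative change between the two fitness totals quoted above (4728 for the optimized dataset and 8017 for the non-optimized one), taken as a magnitude.

```python
def degree_of_optimization(optimized_total, non_optimized_total):
    """Relative change between the two fitness totals, as a fraction."""
    return abs(optimized_total - non_optimized_total) / non_optimized_total


doo = degree_of_optimization(4728, 8017)
print(f"DoO = {doo:.4f} ({doo * 100:.2f}%)")
```

Taking the absolute value keeps the score positive even though the optimized total is the smaller of the two, which matches the way the paper reports the result as a percentage change of about 41%.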
The degree of optimization shows that a 41% variation has occurred within the optimized dataset. The graph in Figure 4.1 depicts this percentage change. According to this measure, the variation has improved the given dataset and reduced its noisy features.
Figure 4.1: Degree of Change for Diabetes data
The graph in Figure 4.1 shows a percentage change of approximately 40%. This change is regarded as the stall value: the point beyond which further optimization of the same dataset produces only stochastic variation, however long the run continues.
5. CONCLUSION
This research has established the usefulness of the Genetic Algorithm as an optimization tool for ascertaining optimal samples for use in prediction. The data obtained from Biostat have been optimized using an appropriate objective function and the associated fitness values. The Matrix Laboratory (MATLAB) simulation interface captured the variation within the dataset. Validation using a standard optimization equation showed an optimization change of 41%. This dataset will probably be used to support classification in a fuzzy system.
REFERENCES
[1] Akbari, Z. (2010). "A multilevel evolutionary algorithm for optimizing numerical functions". IJIEC 2 (2011): 419–430.
[2] Ananya (2017). What is Diabetes. Retrieved online from https://www.news-medical.net/health/What-is-Diabetes.aspx
[3] Coffin, D.; S., Robert E. (2008). "Linkage Learning in Estimation of Distribution Algorithms". Linkage in Evolutionary Computation. Springer Berlin Heidelberg: 141–156. doi:10.1007/978-3-540-85068-7_7.
[4] Eiben, A. E. et al. (1994). "Genetic algorithms with multi-parent recombination". PPSN III: Proceedings of the International Conference on Evolutionary Computation. The Third Conference on Parallel Problem Solving from Nature: 78–87. ISBN 3-540-58484-6.
[5] Echegoyen, C.; Mendiburu, A.; Santana, R.; Lozano, J. A. (2012). "On the Taxonomy of Optimization Problems under Estimation of Distribution Algorithms". Evolutionary Computation. 21 (3): 471–495. ISSN 1063-6560. doi:10.1162/EVCO_a_00095.
[6] Harik, G. R.; Lobo, F. G.; Sastry, K. (2006). "Linkage Learning via Probabilistic Modeling in the Extended Compact Genetic Algorithm (ECGA)". Scalable Optimization via Probabilistic Modeling. Springer Berlin Heidelberg: 39–61. doi:10.1007/978-3-540-34954-9_3.
[7] MedlinePlus (2017). Diabetes. Retrieved online from http://www.medlineplus.com
[8] Mehdi K., Saeede E. and Jamshid P. (2012). Diagnosing Diabetes Type II Using a Soft Intelligent Binary Classification Model. Review of Bioinformatics and Biometrics (RBB), Volume 1, Issue 1, December 2012: 9–23.
[9] Meysam J. and Mahdi M. (2016). Comparison of Predictive Models for the Early Diagnosis of Diabetes. Journal of Health Information Research, Vol. 22(2), pp. 95–100.
[10] Son D. D., Kazem A. and Romeo M. (2016). Maximising Performance of Genetic Algorithm Solver in Matlab. Advance online publication, pp. 1–9.
[11] Taherdangkoo, M.; Paziresh, M.; Yazdi, M.; Bagheri, M. H. (2012). "An efficient algorithm for function optimization: modified stem cells algorithm". Central European Journal of Engineering. 3 (1): 36–50. doi:10.2478/s13531-012-0047-8.

More Related Content

What's hot (14)

PDF
Cancer prognosis prediction using balanced stratified sampling
ijscai
 
PDF
working_example_poster
Huikun Zhang
 
PDF
Comparative study of artificial neural network based classification for liver...
Alexander Decker
 
PDF
Diabetes Prediction by Supervised and Unsupervised Approaches with Feature Se...
IJARIIT
 
PDF
Crimson Publishers-Allometry Scalling in Drug Development
CrimsonPublishersBioavailability
 
PDF
Supervised machine learning based liver disease prediction approach with LASS...
journalBEEI
 
PDF
Significance of integrated taxonomy approach in
IAEME Publication
 
PDF
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
IJCSEIT Journal
 
PDF
Pirch_Latisse Core Deck Final
Sarah Pirch
 
PPTX
Dana Choi | Post harvest quality evaluation system on conveyor belt for mecha...
danachoi_com
 
PDF
Estimating the Survival Function of HIV AIDS Patients using Weibull Model
ijtsrd
 
PDF
clinical._pharmacology._ESSAY
Dimitrios Brachos
 
PDF
The Utilization of Physics Parameter to Classify Histopathology Types of Inva...
IJECEIAES
 
PDF
Mobile Decision Support System to Determine Toddler's Nutrition using Fuzzy S...
IJECEIAES
 
Cancer prognosis prediction using balanced stratified sampling
ijscai
 
working_example_poster
Huikun Zhang
 
Comparative study of artificial neural network based classification for liver...
Alexander Decker
 
Diabetes Prediction by Supervised and Unsupervised Approaches with Feature Se...
IJARIIT
 
Crimson Publishers-Allometry Scalling in Drug Development
CrimsonPublishersBioavailability
 
Supervised machine learning based liver disease prediction approach with LASS...
journalBEEI
 
Significance of integrated taxonomy approach in
IAEME Publication
 
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
IJCSEIT Journal
 
Pirch_Latisse Core Deck Final
Sarah Pirch
 
Dana Choi | Post harvest quality evaluation system on conveyor belt for mecha...
danachoi_com
 
Estimating the Survival Function of HIV AIDS Patients using Weibull Model
ijtsrd
 
clinical._pharmacology._ESSAY
Dimitrios Brachos
 
The Utilization of Physics Parameter to Classify Histopathology Types of Inva...
IJECEIAES
 
Mobile Decision Support System to Determine Toddler's Nutrition using Fuzzy S...
IJECEIAES
 

Similar to GENETIC ALGORITHM (GA) OPTIMIZATION USING DIABETES EXPERIMENTAL DATA (20)

PDF
A genetic algorithm-based feature selection approach for diabetes prediction
IAESIJAI
 
PDF
IRJET- Diabetes Diagnosis using Machine Learning Algorithms
IRJET Journal
 
PDF
IRJET- Comparison of Techniques for Diabetes Detection in Females using Machi...
IRJET Journal
 
PDF
ML In Predicting Diabetes In The Early Stage
IRJET Journal
 
PDF
IRJET - Prediction and Detection of Diabetes using Machine Learning
IRJET Journal
 
PPTX
Early stage of diabetics prediction using machine learnin
VinothVinoth618840
 
PDF
Forecasting Diabetes Mellitus at an Initial Stage using Machine Learning Methods
IRJET Journal
 
PPTX
presentation of miniproject of ppt presentation
nandeshgowda134
 
PDF
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
IAEMEPublication
 
PDF
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
IAEME Publication
 
PDF
Python scikit-fuzzy: developing a fuzzy expert system for diabetes diagnosis
IAESIJAI
 
PDF
Diagnosis of diabetes using classification mining techniques [
IJDKP
 
PPTX
DiabetesPPT.pptx
Ganesh536528
 
PDF
Machine Learning Approaches for Diabetes Classification
IRJET Journal
 
PDF
Artificial Intelligence Approaches for Predicting Diabetes in Egypt
gerogepatton
 
PDF
Artificial Intelligence Approaches for Predicting Diabetes in Egypt
gerogepatton
 
PDF
Early Stage Diabetic Disease Prediction and Risk Minimization using Machine L...
IRJET Journal
 
PDF
Implementation of a Web Application to Foresee and Pretreat Diabetes Mellitus...
IRJET Journal
 
PDF
Disease prediction in big data healthcare using extended convolutional neural...
IJAAS Team
 
PDF
IRJET- Diabetes Prediction using Machine Learning
IRJET Journal
 
A genetic algorithm-based feature selection approach for diabetes prediction
IAESIJAI
 
IRJET- Diabetes Diagnosis using Machine Learning Algorithms
IRJET Journal
 
IRJET- Comparison of Techniques for Diabetes Detection in Females using Machi...
IRJET Journal
 
ML In Predicting Diabetes In The Early Stage
IRJET Journal
 
IRJET - Prediction and Detection of Diabetes using Machine Learning
IRJET Journal
 
Early stage of diabetics prediction using machine learnin
VinothVinoth618840
 
Forecasting Diabetes Mellitus at an Initial Stage using Machine Learning Methods
IRJET Journal
 
presentation of miniproject of ppt presentation
nandeshgowda134
 
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
IAEMEPublication
 
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
IAEME Publication
 
Python scikit-fuzzy: developing a fuzzy expert system for diabetes diagnosis
IAESIJAI
 
Diagnosis of diabetes using classification mining techniques [
IJDKP
 
DiabetesPPT.pptx
Ganesh536528
 
Machine Learning Approaches for Diabetes Classification
IRJET Journal
 
Artificial Intelligence Approaches for Predicting Diabetes in Egypt
gerogepatton
 
Artificial Intelligence Approaches for Predicting Diabetes in Egypt
gerogepatton
 
Early Stage Diabetic Disease Prediction and Risk Minimization using Machine L...
IRJET Journal
 
Implementation of a Web Application to Foresee and Pretreat Diabetes Mellitus...
IRJET Journal
 
Disease prediction in big data healthcare using extended convolutional neural...
IJAAS Team
 
IRJET- Diabetes Prediction using Machine Learning
IRJET Journal
 
Ad

More from Wireilla (20)

PDF
DOUBT INTUITIONISTIC FUZZY IDEALS IN BCK/BCI-ALGEBRAS
Wireilla
 
PDF
CUBIC STRUCTURES OF MEDIAL IDEAL ON BCI -ALGEBRAS
Wireilla
 
PDF
α -ANTI FUZZY NEW IDEAL OF PUALGEBRA
Wireilla
 
PDF
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
Wireilla
 
PDF
DESIGN OF OBSERVER BASED QUASI DECENTRALIZED FUZZY LOAD FREQUENCY CONTROLLER ...
Wireilla
 
PDF
International Journal of Fuzzy Logic Systems (IJFLS)
Wireilla
 
PDF
EFFECTIVE REDIRECTING OF THE MOBILE ROBOT IN A MESSED ENVIRONMENT BASED ON TH...
Wireilla
 
PDF
APPROXIMATE CONTROLLABILITY RESULTS FOR IMPULSIVE LINEAR FUZZY STOCHASTIC DIF...
Wireilla
 
PDF
COMPARISON OF DIFFERENT APPROXIMATIONS OF FUZZY NUMBERS
Wireilla
 
PDF
A FUZZY LOGIC BASED SCHEME FOR THE PARAMETERIZATION OF THE INTER-TROPICAL DIS...
Wireilla
 
PDF
STABILITY ENHANCEMENT OF POWER SYSTEM USING TYPE-2 FUZZY LOGIC POWER SYSTEM S...
Wireilla
 
PDF
STATISTICAL ANALYSIS OF FUZZY LINEAR REGRESSION MODEL BASED ON DIFFERENT DIST...
Wireilla
 
PDF
FUZZY LOAD FREQUENCY CONTROLLER IN DEREGULATED POWER ENVIRONMENT BY PRINCIPAL...
Wireilla
 
PDF
FUZZY LOGIC CONTROL OF A HYBRID ENERGY STORAGE MODULE FOR NAVAL PULSED POWER ...
Wireilla
 
PDF
A COUNTEREXAMPLE TO THE FORWARD RECURSION IN FUZZY CRITICAL PATH ANALYSIS UND...
Wireilla
 
PDF
IMPLEMENTATION OF FUZZY CONTROLLED PHOTO VOLTAIC FED DYNAMIC VOLTAGE RESTORER...
Wireilla
 
PDF
FUZZY CLUSTERING BASED SEGMENTATION OF VERTEBRAE IN T1-WEIGHTED SPINAL MR IMA...
Wireilla
 
PDF
OPTIMAL ALTERNATIVE SELECTION USING MOORA IN INDUSTRIAL SECTOR - A REVIEW
Wireilla
 
PDF
WAVELET- FUZZY BASED MULTI TERMINAL TRANSMISSION SYSTEM PROTECTION SCHEME IN ...
Wireilla
 
PDF
A NEW RANKING ON HEXAGONAL FUZZY NUMBER
Wireilla
 
DOUBT INTUITIONISTIC FUZZY IDEALS IN BCK/BCI-ALGEBRAS
Wireilla
 
CUBIC STRUCTURES OF MEDIAL IDEAL ON BCI -ALGEBRAS
Wireilla
 
α -ANTI FUZZY NEW IDEAL OF PUALGEBRA
Wireilla
 
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
Wireilla
 
DESIGN OF OBSERVER BASED QUASI DECENTRALIZED FUZZY LOAD FREQUENCY CONTROLLER ...
Wireilla
 
International Journal of Fuzzy Logic Systems (IJFLS)
Wireilla
 
EFFECTIVE REDIRECTING OF THE MOBILE ROBOT IN A MESSED ENVIRONMENT BASED ON TH...
Wireilla
 
APPROXIMATE CONTROLLABILITY RESULTS FOR IMPULSIVE LINEAR FUZZY STOCHASTIC DIF...
Wireilla
 
COMPARISON OF DIFFERENT APPROXIMATIONS OF FUZZY NUMBERS
Wireilla
 
A FUZZY LOGIC BASED SCHEME FOR THE PARAMETERIZATION OF THE INTER-TROPICAL DIS...
Wireilla
 
STABILITY ENHANCEMENT OF POWER SYSTEM USING TYPE-2 FUZZY LOGIC POWER SYSTEM S...
Wireilla
 
STATISTICAL ANALYSIS OF FUZZY LINEAR REGRESSION MODEL BASED ON DIFFERENT DIST...
Wireilla
 
FUZZY LOAD FREQUENCY CONTROLLER IN DEREGULATED POWER ENVIRONMENT BY PRINCIPAL...
Wireilla
 
FUZZY LOGIC CONTROL OF A HYBRID ENERGY STORAGE MODULE FOR NAVAL PULSED POWER ...
Wireilla
 
A COUNTEREXAMPLE TO THE FORWARD RECURSION IN FUZZY CRITICAL PATH ANALYSIS UND...
Wireilla
 
IMPLEMENTATION OF FUZZY CONTROLLED PHOTO VOLTAIC FED DYNAMIC VOLTAGE RESTORER...
Wireilla
 
FUZZY CLUSTERING BASED SEGMENTATION OF VERTEBRAE IN T1-WEIGHTED SPINAL MR IMA...
Wireilla
 
OPTIMAL ALTERNATIVE SELECTION USING MOORA IN INDUSTRIAL SECTOR - A REVIEW
Wireilla
 
WAVELET- FUZZY BASED MULTI TERMINAL TRANSMISSION SYSTEM PROTECTION SCHEME IN ...
Wireilla
 
A NEW RANKING ON HEXAGONAL FUZZY NUMBER
Wireilla
 
Ad

Recently uploaded (20)

PPTX
Shinkawa Proposal to meet Vibration API670.pptx
AchmadBashori2
 
PDF
Electrical Engineer operation Supervisor
ssaruntatapower143
 
PPTX
Mechanical Design of shell and tube heat exchangers as per ASME Sec VIII Divi...
shahveer210504
 
PDF
Halide Perovskites’ Multifunctional Properties: Coordination Engineering, Coo...
TaameBerhe2
 
PPTX
VITEEE 2026 Exam Details , Important Dates
SonaliSingh127098
 
PPTX
The Role of Information Technology in Environmental Protectio....pptx
nallamillisriram
 
PPT
Carmon_Remote Sensing GIS by Mahesh kumar
DhananjayM6
 
PDF
Design Thinking basics for Engineers.pdf
CMR University
 
PPTX
Solar Thermal Energy System Seminar.pptx
Gpc Purapuza
 
PPT
PPT2_Metal formingMECHANICALENGINEEIRNG .ppt
Praveen Kumar
 
PPTX
Element 11. ELECTRICITY safety and hazards
merrandomohandas
 
PPTX
What is Shot Peening | Shot Peening is a Surface Treatment Process
Vibra Finish
 
PPTX
Worm gear strength and wear calculation as per standard VB Bhandari Databook.
shahveer210504
 
PDF
PORTFOLIO Golam Kibria Khan — architect with a passion for thoughtful design...
MasumKhan59
 
PPTX
Introduction to Design of Machine Elements
PradeepKumarS27
 
DOC
MRRS Strength and Durability of Concrete
CivilMythili
 
DOCX
CS-802 (A) BDH Lab manual IPS Academy Indore
thegodhimself05
 
PPTX
Thermal runway and thermal stability.pptx
godow93766
 
PDF
Reasons for the succes of MENARD PRESSUREMETER.pdf
majdiamz
 
PPTX
Evaluation and thermal analysis of shell and tube heat exchanger as per requi...
shahveer210504
 
Shinkawa Proposal to meet Vibration API670.pptx
AchmadBashori2
 
Electrical Engineer operation Supervisor
ssaruntatapower143
 
Mechanical Design of shell and tube heat exchangers as per ASME Sec VIII Divi...
shahveer210504
 
Halide Perovskites’ Multifunctional Properties: Coordination Engineering, Coo...
TaameBerhe2
 
VITEEE 2026 Exam Details , Important Dates
SonaliSingh127098
 
The Role of Information Technology in Environmental Protectio....pptx
nallamillisriram
 
Carmon_Remote Sensing GIS by Mahesh kumar
DhananjayM6
 
Design Thinking basics for Engineers.pdf
CMR University
 
Solar Thermal Energy System Seminar.pptx
Gpc Purapuza
 
PPT2_Metal formingMECHANICALENGINEEIRNG .ppt
Praveen Kumar
 
Element 11. ELECTRICITY safety and hazards
merrandomohandas
 
What is Shot Peening | Shot Peening is a Surface Treatment Process
Vibra Finish
 
Worm gear strength and wear calculation as per standard VB Bhandari Databook.
shahveer210504
 
PORTFOLIO Golam Kibria Khan — architect with a passion for thoughtful design...
MasumKhan59
 
Introduction to Design of Machine Elements
PradeepKumarS27
 
MRRS Strength and Durability of Concrete
CivilMythili
 
CS-802 (A) BDH Lab manual IPS Academy Indore
thegodhimself05
 
Thermal runway and thermal stability.pptx
godow93766
 
Reasons for the succes of MENARD PRESSUREMETER.pdf
majdiamz
 
Evaluation and thermal analysis of shell and tube heat exchanger as per requi...
shahveer210504
 

GENETIC ALGORITHM (GA) OPTIMIZATION USING DIABETES EXPERIMENTAL DATA

  • 1. International Journal of Fuzzy Logic Systems(IJFLS) Vol.7, No.3, October 2017 DOI : 10.5121/ijfls.2017.7302 15 GENETIC ALGORITHM (GA) OPTIMIZATION USING DIABETES EXPERIMENTAL DATA Ejiofor C. I1 &Laud Charles Ochei2 Department of Computer Science, University of Port-Harcourt, Port-Harcourt. Nigeria 1 Robert Gordon University, Aberdeen, United Kingdom,2 Abstract Diabetes comprises of noisy features. This feature hampers classification and prediction for Artificial Intelligence (AI) system. The optimization of diabetes dataset using Genetic Algorithm (GA) exploring its fundamentals identifies the focus of this research. The dataset obtained from Biostat comprising of random samples (fifteen: 15) and parameter variables five: cholestrol, high-density lipoprotein, age, height and weight was used for the optimization. The simulation was Matrix Laboratory (MATLB). The optimized dataset was validated using standard optimization equation resulting in percentage score of Forty-one (41%) percent. This dataset will be using in classifying fuzzy system Keywords Genetic Algorithm, Diabetes. 1.INTRODUCTION Diabetes is a condition where the body fails to utilize the ingested glucose properly. Diabetes is caused by deficiency in insulin production or failure of the body to response to insulin production (Ananya, 2017). Excess glucose overtime within the blood stream can result in eye, kidney and nerve damage. It can also inflame heart disease, stroke and even amputation. Diabetes has been identified as the fastest growing long term disease that affects millions of people worldwide(MedlinePlus, 2017). The statisticindeed has been alarming across the globe. In 2013 it was estimated that over 382 million people worldwide were suffering from diabetes. In the United Kingdom, 2million persons have been identified as suffering from diabetes with 750, 000 unaware of their current illness. 
In the United States 25.8 million people or 8.3% of its population have been identified as diabetes sufferer with70 million remaining undiagnosed. In 2010, about 1.9 million new cases of diabetes were also identified in United State with a future prediction of 1 in every 3 Americans having the chance of experiencing diabetes by 2050 (Ananya, 2013). Diabetes can be categorized into: Type I and Type II. Type I diabetes is also known as insulin- dependent diabetes mellitus and is usually associated with younger adolescence.This form of diabetes is dueto the inability of the body to produce insulin as a result of autoimmune disorder destroying the pancreases and eliminating the chance of insulin production. Approximately 10% of diabetes diagnoses are associated with Type I.Type II diabetes; also known as adult-onset or noninsulin-dependent diabetes. This form of diabetes exists due to failure of the pancreas to produce enough insulin to metabolize glucose which is usually associated with aging.Approximately 90% of all cases of diabetes worldwide are associated with Type II (Medlinplus, 2017).
  • 2. International Journal of Fuzzy Logic Systems(IJFLS) Vol.7, No.3, October 2017 16 Diabetes symptoms vary from frequent urination, intense thirst and hunger, weight gain, unusual weight loss, fatigue, male sexual dysfunction, numbness and tingling in hands and feet (MedlinPlus, 2017 and Ananya, 2017). Treatment and diagnosis of diabetes usually are complex with physician depending on patient’s symptoms in collaboration with Age, weight, height, family history and the contributing factor of alcoholism and lack of physical exercise in identifying type I or II diabetes. While the diagnoses have been consistent over time, the complication experienced by novel physician and the non- availability of experienced expert has fostered numerous Artificial Intelligence (AI) models. Although these model has been employed for the prediction of diabetes built to complement the conventional approaches of physician-patient interaction (Mehdi et al., 2012; Meysam and Mahdi, 2016)these models, possibly have suffer for inaccuracies due to noisy training sample which has hampered training cases This research paper explores genetic algorithm optimization technique for preprocessing diabetes datasetidentifying change variation within the dataset. The variation in change will be explored using standard optimization equation. 2.GENETIC ALGORITHM (GA) OVERVIEW Genetic Algorithm (GA) considered a search and optimization technique based on the adaptation of natural selection process and Evolutionary Algorithms (EA) has found its mark in Artificial intelligence problem domains. GA provides a framework for solving both constrained and unconstrained optimization problems based on a natural selection process which mimics biological evolution (Akbari, 2010). GA is used in arriving at optimal solutions; solutions directing the optimization to the best possible area (Eiben, 1994) through the modification of population of individual solutions. 
GA usually creates an optimization process using an initial population. This population encompasses solution seen as candidates with each candidate’s solution possess initial possible solutions. In each generation, the fitness of each individual are usually expressed and examined using the objective function created in most cases based on adaptation, user examine, experimental design and trial error (Son et al., 2016). The fittest individual are stochastically selected from the current population with each genome modified based on genetic operator. This modification creates new generation which are repeatedly evaluated. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population (Harik et al., 2006). GA population size depends largely on the problem domain. This population usually combined numerous possible solutions which are generated randomly forming the search or state space. These solutions are usually attuned toward better solutions (Taherdangkoo et al., 2012). Successive generation solutions are combined with preceding generation to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions are typically more likely to be selected. Novel solutions are produced using parent solution in breeding pool of successive generation by producing children solution (Coffin and Robert, 2008). The optimization process usually terminate based on minimum criteria satisfaction, allocated budget and highest fitness value (Echegoyen et al., 2012). Solution encoding, objective function identification, identified solution, application of operator and termination are indeed the fundamental components of GA (Coffin and Robert, 2008; Akbari, 2010 Echegoyen et al., 2012).
3. METHODOLOGY: DIABETES DATASET OPTIMIZATION USING MATRIX LABORATORY (MATLAB)
The dataset for the simulation was obtained from Biostat. This data served as the experimental data for optimization. The dataset comprises fifteen samples spread across five decision variables: cholesterol, high-density lipoprotein, age, height and weight. Table 3.1 depicts the diabetes simulation data.
Table 3.1: Un-Optimized Diabetes Dataset
3.1 MATLAB GA SIMULATIONS
Figure 3.1: GA-Simulation Chart
The chart of Figure 3.1 captures the fundamentals of the diabetes GA simulation. It shows that five input variables were used together with fifteen samples. The population type for the simulation was set to double, and the initial population was created with a uniform creation function. The chart also depicts the weighted score for each column (parameter) input. In addition, the simulation chart captures the best fitness value, the best individual, the genealogy, the scores and the selection.
Figure 3.2: Generation 1 (initial) GA-Simulation chart
Figure 3.2 shows the first-generation simulation, with the best fitness value at generation one shown as 343 and the overall mean of the 15 samples identified as 506. The generation is identified as initial. The selection function pictures the probability available to each sample of being selected as a prospective parent of children for the next generation, with the fifth individual having the highest selection probability of 6. The current best individual plot gives the best individual score per generation, while the fitness score of each individual can be identified successively for each generation. For this generation an average fitness score of less than 600 was initially maintained.
Figure 3.3: Generation 15 (final) GA-Simulation chart
Figure 3.3 shows the final-generation simulation, with the best fitness value at generation fifteen shown as 307 and the overall mean of the 15 samples identified as 301. The generation is identified as fifteen. The selection function pictures the probability available to each sample of being selected as a prospective parent of children for the next generation, with the fifth individual having the highest selection probability of 7. The current best individual plot gives the best individual score per generation, while the fitness score of each individual can be identified successively for each generation. For this generation an average fitness score above 300 was maintained.
Table 3.2: Optimized Diabetes Dataset
Case      Cholesterol  High-Density Lipoprotein  Age  Height  Weight  Weighted Score  Status   Confusion Matrix
Case 1    78           12                        67   67      119     343             Type II  TP
Case 2    78           12                        67   67      119     343             Type II  FP
Case 3    78           36                        27   67      119     327             Type II  TP
Case 4    78           12                        40   59      121     310             Type II  TP
Case 5    78           12                        36   67      119     312             Type II  TP
Case 6    79           13                        27   62      119     300             Type II  TN
Case 7    78           12                        27   66      119     302             Type II  TN
Case 8    78           12                        67   66      119     342             Type II  TN
Case 9    78           12                        45   62      119     316             Type II  TP
Case 10   78           12                        37   69      119     315             Type II  TP
Case 11   78           12                        27   67      119     303             Type II  TP
Case 12   78           12                        29   64      119     302             Type II  TN
Case 13   78           12                        34   67      119     310             Type II  TN
Case 14   78           12                        36   59      119     303             Type II  TP
Case 15   78           12                        29   63      119     300             Type II  TN
Table 3.2 shows the values obtained after optimization. The values cover cholesterol, high-density lipoprotein, age, height and weight, and show how the data changed compared with the previous values in Table 3.1.
4. VALIDATION OF OPTIMIZED DIABETES DATASET
Validation ascertains the variation and determines the percentage optimization. It measures how much the fundamental components have been optimized. Equation 4.1 determines this change:
\[ \mathrm{DoO} = \frac{|R_0 - M_0|}{M_0} \times 100\% \qquad (4.1) \]
where $R_0$ is the summation of the fitness values of the optimized dataset and $M_0$ is the summation of the fitness values of the non-optimized dataset.
Table 4.2 captures the non-optimized and the optimized datasets, exemplified from case 1-10 and 110.
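As a quick check on the summation of fitness values for the optimized dataset, the weighted scores listed in Table 3.2 can simply be added up. The sketch below assumes that the "Weighted Score" column is the per-case fitness value; the variable names are illustrative and the numbers are copied from the table.

```python
# Weighted scores of the fifteen optimized cases, copied from Table 3.2.
optimized_scores = [343, 343, 327, 310, 312, 300, 302, 342,
                    316, 315, 303, 302, 310, 303, 300]

# R0: summation of the fitness values of the optimized dataset.
r0 = sum(optimized_scores)
print(r0)  # 4728
```

The total agrees with the optimized sum of 4728 used in the degree-of-optimization calculation.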
Table 4.3: Non-Optimized and Optimized Fitness Values
\[ \text{Degree of Optimization (DoO)} = \frac{|4728 - 8017|}{8017} = 0.4102, \qquad 0.4102 \times 100 = 41.0\% \]
The degree of optimization shows that a change of about 41% has occurred within the optimized dataset. Figure 4.1 depicts this percentage change graphically. This variation has improved the given dataset and subsequently eliminated noisy features.
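The arithmetic above can be reproduced with a short Python sketch. The function name `degree_of_optimization` is illustrative, and the use of the absolute difference is an assumption made so that the result is reported as a positive percentage; the two sums (4728 and 8017) are the values stated in the paper.

```python
def degree_of_optimization(r0, m0):
    """Percentage change of the optimized sum r0 relative to the non-optimized sum m0."""
    return abs(r0 - m0) / m0 * 100

doo = degree_of_optimization(4728, 8017)
print(round(doo, 1))  # 41.0
```

Without the absolute value the ratio is negative (about -0.41), which reflects that the summed fitness values decreased after optimization.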
Figure 4.1: Degree of Change for Diabetes data
The graph of Figure 4.1 shows a percentage change of 40%. This change is perceived as the stall change value: the point beyond which further optimization of the same dataset, being stochastic, could continue indefinitely.
5. CONCLUSION
This research has established the usefulness of the Genetic Algorithm as an optimization tool for ascertaining optimal samples to be used for prediction. The data obtained from Biostat have been optimized using an appropriate objective function and associated fitness values. The Matrix Laboratory (MATLAB) simulation interfaces captured the variation within the dataset. The validation using a standard optimization equation captured an optimization change of 41%. This dataset will probably be used to provide classification in a fuzzy system.
REFERENCES
[1] Akbari Z. (2010). "A multilevel evolutionary algorithm for optimizing numerical functions". IJIEC 2 (2011): 419–430.
[2] Ananya (2017), What is Diabetes, retrieved online from https://www.news-medical.net/health/What-is-Diabetes.aspx
[3] Coffin, D.; S., Robert E. (2008). "Linkage Learning in Estimation of Distribution Algorithms". Linkage in Evolutionary Computation. Springer Berlin Heidelberg: 141–156. doi:10.1007/978-3-540-85068-7_7.
[4] Eiben, A. E. et al. (1994). Genetic algorithms with multi-parent recombination, PPSN III: Proceedings of the International Conference on Evolutionary Computation. The Third Conference on Parallel Problem Solving from Nature: 78–87. ISBN 3-540-58484-6.
[5] Echegoyen, C.; Mendiburu, A.; Santana, R.; Lozano, J. A. (2012). "On the Taxonomy of Optimization Problems under Estimation of Distribution Algorithms". Evolutionary Computation. 21 (3): 471–495. ISSN 1063-6560. doi:10.1162/EVCO_a_00095.
[6] Harik, G. R.; Lobo, F. G.; Sastry, K. (2006), Linkage Learning via Probabilistic Modeling in the Extended Compact Genetic Algorithm (ECGA), Scalable Optimization via Probabilistic Modeling. Springer Berlin Heidelberg: 39–61. doi:10.1007/978-3-540-34954-9_3.
[7] MedlinePlus (2017), Diabetes, retrieved online from http://www.medlineplus.com
[8] Mehdi K., Saeede E. and Jamshid P. (2012), Diagnosing Diabetes Type II Using a Soft Intelligent Binary Classification Model, Review of Bioinformatics and Biometrics (RBB), Volume 1, Issue 1, December 2012, 9-23.
[9] Meysam J. and Mahdi M. (2016), Comparison of Predictive Models for the Early Diagnosis of Diabetes, Journal of Health Information Research, Vol. 22(2), pp. 95-100.
[10] Son D. D., Kazem A. and Romeo M. (2016), Maximising Performance of Genetic Algorithm Solver in Matlab, Advance online publication, pp. 1-9.
[11] Taherdangkoo, M.; Paziresh, M.; Yazdi, M.; Bagheri, M. H. (2012). An efficient algorithm for function optimization: modified stem cells algorithm, Central European Journal of Engineering. 3 (1): 36–50. doi:10.2478/s13531-012-0047-8.