International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2334
A SURVEY ON MACHINE LEARNING INTELLIGENCE TECHNIQUES FOR
MEDICAL DATASET CLASSIFICATION
P.M. Benson Mansingh1, Dr. M. Yuvaraju2
1Assistant Professor, Dept of ECE, Sri Ramakrishna Institute of Technology, Coimbatore
2Assistant Professor, Dept of EEE, Anna University Regional Campus, Coimbatore
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - This paper surveys the application of artificial
intelligence computing techniques to disease diagnosis through
the classification of biomedical datasets. Many artificial
intelligence techniques were reviewed for medical dataset
classification. The survey assembles representative work
showing how the Artificial Neural Network (ANN) is applied to
the diagnosis of different diseases by classification. It also
identifies the ANN methods and techniques that are used most
frequently to solve problems specific to medical dataset
classification. The Extreme Learning Machine (ELM) is used in
almost all cases to train the network on the medical datasets.
Similarly, Particle Swarm Optimization (PSO) is used to
optimize the attributes or parameters of the datasets for
classification. For several diseases, such as breast cancer,
heart disease and diabetes, the ANN approach leads to the use
of the Support Vector Machine (SVM) and the back-propagation
(BP) network.
Key Words: Medical datasets, Machine learning, Artificial
Intelligence (AI), Extreme Learning Machine (ELM), Artificial
Neural Network (ANN)
1. INTRODUCTION
Artificial Intelligence (AI) is the science and engineering of
making intelligent machines, especially intelligent programs.
Diagnosis is concerned with the development of algorithms and
techniques that are able to determine whether the behaviour of
a system is correct. The application of computational or
machine intelligence in medical diagnosis is a new trend for
medical dataset classification. A classification system can
help to minimize the errors that inexperienced experts may
make, and it allows medical datasets to be examined in a
shorter time and in more detail. Automated diagnostic systems
are one application area of database analysis and medical
dataset classification.
These studies aim to assist doctors in making diagnostic
decisions and to ensure that the diagnosis is accurate. In the
medical field, many researchers apply different techniques to
improve the classification accuracy on the given data, so that
affected patients are identified correctly and the diagnosis
improves. Classification performance is measured through
sensitivity, specificity and accuracy. An ideal classifier
would score one hundred percent on all three.
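The three measures can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = (TP + TN) / total), which the survey does not spell out; the function name and the label encoding (1 = diseased, 0 = healthy) are hypothetical choices made for illustration.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy for a two-class problem.

    Assumes the standard confusion-matrix definitions; `positive`
    marks the label treated as "diseased" (an illustrative choice).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    return {
        # Share of diseased cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of healthy cases that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(diagnostic_metrics(truth, predicted))
```

Reporting all three values together matters for medical data, because a classifier can reach high accuracy on an imbalanced dataset while still missing most diseased cases.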
2. MEDICAL DATASET
Medical classification, or medical coding, is the process of
transforming descriptions of medical diagnoses and
procedures into universal medical code numbers. The
diagnoses and procedures are usually taken from a variety of
sources within the healthcare record, such as the
transcription of the physician's notes, laboratory results
and radiologic results.
Medical classification systems are used for a variety of
applications in medicine, public health and medical
informatics, including:
• statistical analysis of diseases and therapeutic actions
• reimbursement, e.g., based on diagnosis-related groups
• knowledge-based and decision support systems
• direct surveillance of epidemic or pandemic outbreaks
The medical datasets are taken from the UCI Machine Learning
Repository.
2.1 MEDICAL DATASET ANALYSIS METHODS
Medical dataset analysis involves three main steps: dataset
pre-processing, feature selection and classification of the
dataset. These steps are described in the following sections.
2.2 DATASET PRE PROCESSING
In pre-processing, the thousands of medical datasets are
combined into one relational table. Pre-processing deletes
mismatched data and removes multivalued attributes. Missing
values are replaced using the attribute's mean, median or
standard deviation (SD).
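The replacement of missing values can be sketched as follows. This is a minimal illustration that assumes missing entries are encoded as `None` and covers only the mean and median strategies; the function name `impute_column` is hypothetical and not taken from the surveyed work.

```python
from statistics import mean, median


def impute_column(values, strategy="mean"):
    """Fill missing entries (None) of one attribute column.

    Assumption: missing values are marked with None. The fill value
    is computed from the observed entries only.
    """
    if strategy not in ("mean", "median"):
        raise ValueError("strategy must be 'mean' or 'median'")

    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")

    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    print(impute_column([1.0, None, 3.0]))
    print(impute_column([1, None, 2, 10], strategy="median"))
```

The median is usually the safer choice when an attribute contains outliers, since a single extreme value can shift the mean considerably.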
Table -1: Different medical datasets
Disease          Number of Instances   Number of Attributes   Number of Classes
Breast Cancer    699                   10                     2
Diabetes         1151                  20                     3
Lung Cancer      32                    56                     4
Heart            303                   75                     2
Hepatitis        155                   19                     2
Thyroid          7200                  21                     4
2.3 FEATURE SELECTION
After the datasets are normalized, a feature selection method
is used to obtain the most important features of the dataset.
The method identifies the significant features before
classification. Methods used for feature selection include
F-Score, threshold fuzzy entropy, PCA and GDA.
F-SCORE
Feature selection by F-Score is used to find the optimal subset
of input variables by removing features that carry no
predictive information. This gives good accuracy when
classifying the medical datasets. It follows these steps:
• The F-Score of every feature is calculated using the
F-Score formula.
• The mean of the F-Scores is calculated.
• The F-Score of each feature is compared with the mean
value.
• The F-Score measures how well a feature discriminates
between two different classes.
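The steps above can be sketched in code. The survey does not reproduce the formula, so this sketch assumes the commonly used two-class F-score: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. It also assumes that features scoring above the mean F-score are the ones kept. The function names and the 0/1 label encoding are hypothetical.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Two-class F-score of a single feature (assumed definition).

    pos / neg hold the feature's values for the positive and negative
    class; each needs at least two values for the sample variance.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # (n - 1) denominators
    return between / within if within > 0 else float("inf")


def select_by_mean_threshold(X, y):
    """Score every feature and keep those above the mean score."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    threshold = mean(scores)
    selected = [j for j, s in enumerate(scores) if s > threshold]
    return scores, selected


if __name__ == "__main__":
    # Feature 0 separates the classes well, feature 1 does not.
    X = [[1.0, 5.0], [2.0, 3.0], [9.0, 4.0], [10.0, 6.0]]
    y = [0, 0, 1, 1]
    print(select_by_mean_threshold(X, y))
```

A larger score indicates that the class means are far apart relative to the spread within each class, which is why low-scoring features are treated as carrying little predictive information.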
THRESHOLD FUZZY ENTROPY BASED FEATURE
SELECTION
Features are selected based on Fuzzy C-Means clustering,
using one of three selection frameworks:
• Mean selection
• Half selection
• Neural network
2.4 PCA
Principal Component Analysis (PCA) converts a set of
observations of possibly correlated variables into a set of
linearly uncorrelated variables. It reduces the dimension of
the datasets with minimal loss of information and retains the
most relevant features. The feature dimension is reduced by
extracting the subset of features that best describes the
data, which yields high classification accuracy.
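The idea of projecting correlated attributes onto an uncorrelated direction can be illustrated with a small sketch. It assumes power iteration on the sample covariance matrix, which recovers only the first principal component; this is an illustrative shortcut and not the procedure used in the surveyed papers. The function name is hypothetical.

```python
def first_principal_component(X, iterations=200):
    """Return (direction, projections) for the first principal component.

    Assumption: power iteration on the sample covariance matrix. The
    start vector of ones could in rare cases be orthogonal to the
    leading direction; a random start would avoid that.
    """
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in X]

    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break  # data has no variance along the current vector
        v = [x / norm for x in w]

    # One-dimensional representation of every observation.
    projections = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projections


if __name__ == "__main__":
    direction, reduced = first_principal_component(
        [[1, 1], [2, 2], [3, 3], [4, 4]])
    print(direction, reduced)
```

For two perfectly correlated attributes, as in the usage example, a single component carries all of the variance, so the second dimension can be dropped without loss.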
2.5 GDA
General Discriminant Analysis (GDA) applies a linear analysis
model to the discriminant analysis problem and to
classification problems. GDA allows a complex model to be
specified for the set of predictor variables.
FEATURE CLASSIFICATION
After the features have been extracted using the feature
selection techniques, the datasets are classified using
algorithms such as Neural Network, Support Vector Machine
(SVM), K-NN (K-Nearest Neighbor) and Decision Tree. With
these techniques the best features are grouped together and
the datasets are classified with high accuracy in disease
detection. Both supervised and unsupervised learning
algorithms are used. Supervised learning algorithms include
ANN, K-NN and Decision Tree; unsupervised learning
techniques include clustering and Fuzzy C-Means.
2.6 NEURAL NETWORK
An Artificial Neural Network (ANN) is a powerful tool for
solving complex problems with linear or nonlinear
input-output relationships. It is modelled on the connections
of the human brain, which has a large number of neurons. A
neural network has three kinds of layers: an input layer, a
hidden layer and an output layer. The hidden layer maps the
input to the output. Neural networks have the following
features: they support several network architectures for
supervised and unsupervised learning, they use parallel
computing for the training process, and they can use dynamic
networks to store data. Unsupervised learning can be used to
train on new inputs so that the network adjusts itself
continuously. Mathematical rules, in the form of learning and
training functions, adjust the weights and biases
automatically.
Fig -1: Neural Network
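The input, hidden and output layers can be illustrated with a minimal forward pass. The sketch assumes a single hidden layer, sigmoid activations and hand-chosen weights; the function names and all numeric values are hypothetical, and the training step that adjusts weights and biases is omitted.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice of activation)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid.

    weights[i] holds the incoming weights of neuron i.
    """
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


if __name__ == "__main__":
    # Two inputs, two hidden neurons, one output neuron.
    output = forward([0.0, 0.0],
                     hidden_w=[[1.0, 0.0], [0.0, 1.0]], hidden_b=[0.0, 0.0],
                     out_w=[[1.0, 1.0]], out_b=[0.0])
    print(output)
```

In a trained network the weights and biases would come from a learning rule such as back-propagation rather than being set by hand.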
2.7 SVM
An SVM classifies by finding the hyperplane that maximizes
the margin between two classes. The training points that lie
closest to the hyperplane are the support vectors. Maximizing
the width of the margin gives the optimal hyperplane. An SVM
with a linear hyperplane has a unique global optimum. The
model assigns each new sample to one of the classes, making
the SVM a non-probabilistic linear classifier.
Fig -2: Support Vectors
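The relation between the hyperplane, the margin and the support vectors can be sketched as follows. The sketch assumes that a linear hyperplane w·x + b = 0 has already been found and is scaled so that the closest points satisfy y(w·x + b) = 1; it does not train an SVM. The weights, bias and sample points are hand-chosen, hypothetical values.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b of a sample relative to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Assign class +1 or -1 from the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin boundaries, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def support_vectors(w, b, X, y, tol=1e-9):
    """Training points lying exactly on a margin boundary."""
    return [x for x, label in zip(X, y)
            if abs(label * decision_value(w, b, x) - 1.0) <= tol]


if __name__ == "__main__":
    w, b = [1.0, 0.0], -2.0  # assumed hyperplane: first coordinate = 2
    X = [[1.0, 0.0], [3.0, 5.0], [5.0, 1.0], [0.0, 0.0]]
    y = [-1, 1, 1, -1]
    print(margin_width(w), support_vectors(w, b, X, y))
    print(predict(w, b, [4.0, 0.0]), predict(w, b, [0.0, 3.0]))
```

Only the support vectors determine the hyperplane: moving the other points, as long as they stay outside the margin, would leave the classifier unchanged.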
2.8 K-NN
K-NN is a non-parametric method used for classification and
regression. In both cases the input consists of the K closest
neighbors in the feature space. In classification, the output
is a class membership: the object is assigned to the class
most common among its K nearest neighbors. K is a positive
integer, typically small.
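A minimal sketch of K-NN classification is given below. It assumes Euclidean distance and a simple majority vote among the K nearest training samples; the function name, the sample points and the class labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors.

    Assumes Euclidean distance (math.dist, Python 3.8+).
    """
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of samples")

    # Pair every training sample's distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y))

    # Count the labels of the k closest samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    X = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
    y = ["benign"] * 3 + ["malignant"] * 3
    print(knn_predict(X, y, [2, 2]))
    print(knn_predict(X, y, [8.5, 8.5]))
```

An odd K is a common choice for two-class problems because it avoids tied votes.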
2.9 PERFORMANCE ACCURACY
Diseases are diagnosed from medical datasets using several
feature selection methods and classification techniques. The
diagnosis accuracy varies from one technique to another.
Combining different feature extraction techniques with a
classification method provides better results. Table 2
shows the accuracy levels of various feature extraction and
classification methods during the disease identification
process.
Table -2: Accuracy of Methods
3. CONCLUSION
Medical datasets can be classified by extracting different
kinds of features from the datasets. For extracting the
features, Fuzzy C-Means clustering gives the highest
accuracy. After pre-processing, the data has to be smoothed
and optimized for the particular feature using techniques
such as SVM and PSO. Applying PSO and a Fuzzy Cognitive Map
to the optimized result then provides high classification
accuracy. Such classification of medical datasets will be
used for the diagnosis of disease at an early stage.
ACKNOWLEDGEMENT
We earnestly thank our Supervisor for all his valuable
comments and active support.
REFERENCES
[1] Kemal Polat and Salih Gunes, “A new feature selection
method on classification of medical datasets: Kernel F-score
feature selection,” Elsevier, 2009.
[2] P. Jaganathan and R. Kuppuchamy, “A threshold
fuzzy entropy based feature selection for medical database
classification,” Computers in Biology and Medicine, vol. 43,
no. 12, pp. 2222–2229, 2013.
[3] Peng Tao and Huang Yi, “A Method Based on Weighted
F-score and SVM for Feature Selection,” in CCDC, 2015.
[4] H. Temurtas, N. Yumusak, and F. Temurtas, “A
comparative study on diabetes disease diagnosis using
neural networks,” Expert Systems with Applications, vol. 36,
no. 4, pp. 8610–8615, 2009.
[5] K. Polat, S. Güneş, and A. Arslan, “A cascade learning
system for classification of diabetes disease: generalized
discriminant analysis and least square support vector
machine,” Expert Systems with Applications, vol. 34, no. 1,
pp. 482–487, 2008.
[6] M. F. Akay, “Support vector machines combined with
feature selection for breast cancer diagnosis,” Expert
Systems with Applications, vol. 36, no. 2, pp. 3240–3247,
2009.
[7] S. Salcedo-Sanz, A. Pastor-Sánchez, L. Prieto, A.
Blanco-Aguilera, and R. García-Herrera, “Feature selection in
wind speed prediction systems based on a hybrid coral reefs
optimization - extreme learning machine approach,” Energy
Conversion and Management, vol. 87, pp. 10–18, 2014.
S. No   Selection Method    Classification Algorithm   Accuracy   Citation
1       Kernel F-Score      SVM                        76.03%     [1]
2       Fuzzy entropy       K-NN                       75.45%     [7]
3       F-Score             NN                         85.90%     [3]
4       Weighted F-Score    RBF                        79.12%     [8]
5       PCA                 SVM                        67.8%      [3,4]
6       GDA                 NN                         70.8%      [5]
[8] J. Kennedy and R. C. Eberhart, “Particle swarm
optimization,” in Proceedings of the IEEE International
Conference on Neural Networks, vol. 4, pp. 1942–1948, IEEE,
Perth, Australia, December 1995.
[9] Kayaer, K., & Yıldırım, T. (2003). Medical diagnosis
on Pima Indian diabetes using general regression neural
networks. In Artificial neural networks and neural information
processing (ICANN/ICONIP) (pp. 181–184), Istanbul, Turkey,
June 26–29.
[10] Vapnik, V. (1995). The nature of statistical learning
theory. New York: Springer.
[11] Ster, B., & Dobnikar, A. (1996). Neural networks in
medical diagnosis: Comparison with other methods. In
Proceedings of the international conference on engineering
applications of neural networks (pp. 427–430).
[12] West, D., Mangiameli, P., Rampal, R., & West,
V. (2005). Ensemble strategies for a medical diagnosis
decision support system: A breast cancer diagnosis
application. European Journal of Operational Research (162),
532–551.
BIOGRAPHIES
P.M. Benson Mansingh completed his B.E. (Electronics and
Communication Engineering) at St. Peter’s Engineering College
in 2010. He received his Master’s Degree in Network
Engineering from Anna University Regional Campus, Coimbatore
in 2014. He is currently pursuing his Ph.D. degree at Anna
University, Chennai and working as Assistant Professor at
Sri Ramakrishna Institute of Technology.
Dr. M. Yuvaraju is Assistant Professor in the Department of
Electrical & Electronics Engineering, Anna University
Regional Centre, Coimbatore. He has guided several UG and PG
students and is currently guiding 5 Ph.D. Research Scholars
at Anna University, Chennai.
hoto

More Related Content

PDF
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET Journal
 
PDF
IRJET- GDPS - General Disease Prediction System
IRJET Journal
 
PDF
IRJET- Result on the Application for Multiple Disease Prediction from Symptom...
IRJET Journal
 
PDF
IRJET- Disease Prediction System
IRJET Journal
 
PDF
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET Journal
 
PDF
IRJET - Review on Classi?cation and Prediction of Dengue and Malaria Dise...
IRJET Journal
 
PDF
Survey on semi supervised classification methods and
eSAT Publishing House
 
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET Journal
 
IRJET- GDPS - General Disease Prediction System
IRJET Journal
 
IRJET- Result on the Application for Multiple Disease Prediction from Symptom...
IRJET Journal
 
IRJET- Disease Prediction System
IRJET Journal
 
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET Journal
 
IRJET - Review on Classi?cation and Prediction of Dengue and Malaria Dise...
IRJET Journal
 
Survey on semi supervised classification methods and
eSAT Publishing House
 

What's hot (19)

PDF
Survey on semi supervised classification methods and feature selection
eSAT Journals
 
PDF
IRJET - Disease Detection in Plant using Machine Learning
IRJET Journal
 
PDF
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET Journal
 
PDF
IRJET - Alzheimer’s Detection Model Using Machine Learning
IRJET Journal
 
PDF
IRJET- Detection of Plant Leaf Diseases using Image Processing and Soft-C...
IRJET Journal
 
PDF
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET Journal
 
PDF
IRJET- Medical Data Mining
IRJET Journal
 
PPT
Detection of plant diseases
Muneesh Wari
 
PDF
A novel medical image segmentation and classification using combined feature ...
eSAT Journals
 
PDF
An Exploration on the Identification of Plant Leaf Diseases using Image Proce...
Tarun Kumar
 
PDF
IRJET- A New Hybrid Squirrel Search Algorithm and Invasive Weed Optimization ...
IRJET Journal
 
PDF
IRJET-Android Based Plant Disease Identification System using Feature Extract...
IRJET Journal
 
PDF
A Novel Machine Learning Based Approach for Detection and Classification of S...
IRJET Journal
 
PDF
IRJET- An Expert System for Plant Disease Diagnosis by using Neural Network
IRJET Journal
 
PDF
Identification of Disease in Leaves using Genetic Algorithm
ijtsrd
 
PDF
ICU Patient Deterioration Prediction : A Data-Mining Approach
csandit
 
PDF
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Tarun Kumar
 
PDF
A comprehensive study on disease risk predictions in machine learning
IJECEIAES
 
PDF
IRJET- Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
IRJET Journal
 
Survey on semi supervised classification methods and feature selection
eSAT Journals
 
IRJET - Disease Detection in Plant using Machine Learning
IRJET Journal
 
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET Journal
 
IRJET - Alzheimer’s Detection Model Using Machine Learning
IRJET Journal
 
IRJET- Detection of Plant Leaf Diseases using Image Processing and Soft-C...
IRJET Journal
 
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET Journal
 
IRJET- Medical Data Mining
IRJET Journal
 
Detection of plant diseases
Muneesh Wari
 
A novel medical image segmentation and classification using combined feature ...
eSAT Journals
 
An Exploration on the Identification of Plant Leaf Diseases using Image Proce...
Tarun Kumar
 
IRJET- A New Hybrid Squirrel Search Algorithm and Invasive Weed Optimization ...
IRJET Journal
 
IRJET-Android Based Plant Disease Identification System using Feature Extract...
IRJET Journal
 
A Novel Machine Learning Based Approach for Detection and Classification of S...
IRJET Journal
 
IRJET- An Expert System for Plant Disease Diagnosis by using Neural Network
IRJET Journal
 
Identification of Disease in Leaves using Genetic Algorithm
ijtsrd
 
ICU Patient Deterioration Prediction : A Data-Mining Approach
csandit
 
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Tarun Kumar
 
A comprehensive study on disease risk predictions in machine learning
IJECEIAES
 
IRJET- Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
IRJET Journal
 
Ad

Similar to IRJET - A Survey on Machine Learning Intelligence Techniques for Medical Dataset Classsification (20)

PDF
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
AIRCC Publishing Corporation
 
PDF
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
ijcsit
 
PDF
A comparative analysis of classification techniques on medical data sets
eSAT Publishing House
 
PDF
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
IRJET Journal
 
PDF
IRJET- Disease Prediction using Machine Learning
IRJET Journal
 
PDF
HEALTH PREDICTION ANALYSIS USING DATA MINING
Ashish Salve
 
PDF
Predicting disease from several symptoms using machine learning approach.
IRJET Journal
 
PDF
DISEASE PREDICTION SYSTEM USING SYMPTOMS
IRJET Journal
 
PDF
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
PDF
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
PDF
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
DOC
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
ahmad abdelhafeez
 
PDF
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
PDF
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
PDF
Propose a Enhanced Framework for Prediction of Heart Disease
IJERA Editor
 
PDF
IRJET- A Data Mining with Big Data Disease Prediction
IRJET Journal
 
PDF
IRJET- Prediction of Heart Disease using RNN Algorithm
IRJET Journal
 
PDF
Classification of Heart Diseases Patients using Data Mining Techniques
Lovely Professional University
 
PDF
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
IRJET Journal
 
PDF
ICU PATIENT DETERIORATION PREDICTION: A DATA-MINING APPROACH
cscpconf
 
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
AIRCC Publishing Corporation
 
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
ijcsit
 
A comparative analysis of classification techniques on medical data sets
eSAT Publishing House
 
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
IRJET Journal
 
IRJET- Disease Prediction using Machine Learning
IRJET Journal
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
Ashish Salve
 
Predicting disease from several symptoms using machine learning approach.
IRJET Journal
 
DISEASE PREDICTION SYSTEM USING SYMPTOMS
IRJET Journal
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
ahmad abdelhafeez
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
Propose a Enhanced Framework for Prediction of Heart Disease
IJERA Editor
 
IRJET- A Data Mining with Big Data Disease Prediction
IRJET Journal
 
IRJET- Prediction of Heart Disease using RNN Algorithm
IRJET Journal
 
Classification of Heart Diseases Patients using Data Mining Techniques
Lovely Professional University
 
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
IRJET Journal
 
ICU PATIENT DETERIORATION PREDICTION: A DATA-MINING APPROACH
cscpconf
 
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
PDF
Kiona – A Smart Society Automation Project
IRJET Journal
 
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
PDF
Breast Cancer Detection using Computer Vision
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
Kiona – A Smart Society Automation Project
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
Breast Cancer Detection using Computer Vision
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 

Recently uploaded (20)

PDF
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
hatem173148
 
PPT
Understanding the Key Components and Parts of a Drone System.ppt
Siva Reddy
 
PPTX
business incubation centre aaaaaaaaaaaaaa
hodeeesite4
 
PPTX
Online Cab Booking and Management System.pptx
diptipaneri80
 
PPTX
MSME 4.0 Template idea hackathon pdf to understand
alaudeenaarish
 
PDF
settlement FOR FOUNDATION ENGINEERS.pdf
Endalkazene
 
PPTX
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
PPTX
sunil mishra pptmmmmmmmmmmmmmmmmmmmmmmmmm
singhamit111
 
PDF
Introduction to Ship Engine Room Systems.pdf
Mahmoud Moghtaderi
 
PPT
1. SYSTEMS, ROLES, AND DEVELOPMENT METHODOLOGIES.ppt
zilow058
 
PDF
STUDY OF NOVEL CHANNEL MATERIALS USING III-V COMPOUNDS WITH VARIOUS GATE DIEL...
ijoejnl
 
PDF
All chapters of Strength of materials.ppt
girmabiniyam1234
 
PPTX
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
PDF
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
PPTX
Information Retrieval and Extraction - Module 7
premSankar19
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PDF
CAD-CAM U-1 Combined Notes_57761226_2025_04_22_14_40.pdf
shailendrapratap2002
 
PPTX
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
PDF
Biodegradable Plastics: Innovations and Market Potential (www.kiu.ac.ug)
publication11
 
PPTX
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
hatem173148
 
Understanding the Key Components and Parts of a Drone System.ppt
Siva Reddy
 
business incubation centre aaaaaaaaaaaaaa
hodeeesite4
 
Online Cab Booking and Management System.pptx
diptipaneri80
 
MSME 4.0 Template idea hackathon pdf to understand
alaudeenaarish
 
settlement FOR FOUNDATION ENGINEERS.pdf
Endalkazene
 
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
sunil mishra pptmmmmmmmmmmmmmmmmmmmmmmmmm
singhamit111
 
Introduction to Ship Engine Room Systems.pdf
Mahmoud Moghtaderi
 
1. SYSTEMS, ROLES, AND DEVELOPMENT METHODOLOGIES.ppt
zilow058
 
STUDY OF NOVEL CHANNEL MATERIALS USING III-V COMPOUNDS WITH VARIOUS GATE DIEL...
ijoejnl
 
All chapters of Strength of materials.ppt
girmabiniyam1234
 
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
Information Retrieval and Extraction - Module 7
premSankar19
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
CAD-CAM U-1 Combined Notes_57761226_2025_04_22_14_40.pdf
shailendrapratap2002
 
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
Biodegradable Plastics: Innovations and Market Potential (www.kiu.ac.ug)
publication11
 
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 

IRJET - A Survey on Machine Learning Intelligence Techniques for Medical Dataset Classsification

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2334 A SURVEY ON MACHINE LEARNING INTELLIGENCE TECHNIQUES FOR MEDICAL DATASET CLASSSIFICATION P.M. Benson Mansingh1, Dr. M. Yuvaraju2 1Assistant Professor, Dept of ECE, Sri Ramakrishna Institute of Technology, Coimbatore 2Assistant Professor, Dept of EEE, Anna University Regional Campus, Coimbatore ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - In this paper, a survey has been done on the application of Artificial intelligence computing techniquesfor diagnostic of disease by classifying the bio medical datasets. Many Artificial Intelligence techniques were reviewed for medical dataset classification. This Exploration assembles typical work that shows how the Artificial Neural Network is applied to the solution of different diagnostic disease with classification. It also detects themethodsandthetechniquesof ANN that are used frequently to solve the special problem related to the medicaldataset classification. ExtremeLearning Machine (ELM) is used in almost for learning the medical datasets to the network. Similarly PSO is used to optimize the attributes or the parameter of the datasets for classification. Several diseases like Breast cancer, Heart disease, Diabetes….etc using ANN approach are result in use of SVM (Support vector Machine) and BP network. Key Words: Medical datasets, Machine learning, Artificial Intelligence (AI), Extreme learning Machine(ELM),Artificial Neural Network (ANN) 1. INTRODUCTION Artificial Intelligence (AI) is the science and engineering making intelligent machines, especiallyintelligentprograms. 
And diagnosis is followed withthedevelopmentofalgorithm and techniques that are able to determine whether the behaviour of a system is correct. The application of computational or machine intelligence in medical diagnosis is a new trend for medical dataset classification. Classification system can help to minimizing possible errors that can be done because of inexperienced experts. And also provide medical dataset to be examined in shorter time and in detail. One of the application areas of analyzing database and medical dataset classification is automated diagnostic systems. The aims of these studies are assisting to doctors in making diagnostic decision with a subject to assurethediagnosis aid accurately. In medical field, many researchers applying different technique to improve the accuracy for the given data, by accurate values give after classification to know the affected patient and improve the diagnosis. That the classification efficiency is allowed through Sensitivity, Specificity and Accuracy for classification functions. A good classifier should give hundred percent results for all the three. 2. MEDICAL DATASET Medical classification, or medical coding, is the process of transforming descriptions of medical diagnoses and procedures into universal medical code numbers. The diagnosesand procedureswithinthehealthcarerecord,such as the transcription of the physician's notes, laboratory results,radiologicresults,andothersourcesareusuallytaken from a variety of source. Medical classification systems are used for a variety of applications in medicine, public health and medical informatics, including: statistical analysis of diseases and therapeutic actions reimbursement; e.g.,basedondiagnosis- related groups knowledge- based and decision support systems direct surveillance of epidemic or pandemic outbreaks. The medical datasets are taken from the UCI machine repository. 
2.1 MEDICAL DATASET ANLAYSIS METHODS Medical dataset analysis method is included with three important steps. First step includes dataset pre processing, second follows feature selection and final step include classifying the dataset. This all described in following sections. 2.2 DATASET PRE PROCESSING Pre processing the thousands of medical datasets are combined into one relational table, by pre processing it delete the mismatched data and also it removes the multivalued attribute. And it replaces the missing value by its mean, median and its standard deviation (SD). Table -1: Different medical datasets LIST OF DISEASES Number of Instances OF INSTANCES Number of Attributes OF ATTRIBUTES Number of Classes O Breast Cancer 699 10 2 Diabetes 1151 20 3 Lung Cancer 32 56 4 Heart 303 75 2 Hepatitis 155 19 2 Thyroid 7200 21 4 Breast Cancer 699 10 2
2.3 FEATURE SELECTION

After the datasets are normalized, a feature selection method is used to obtain the most important features of the dataset. The method identifies the significant features before classification. Methods used for feature selection include F-Score, threshold fuzzy entropy, PCA and GDA.

F-SCORE

Feature selection by F-Score is used to find the optimal subset of input variables by removing features that carry no predictive information, which gives good accuracy when classifying medical datasets. It follows these steps:
• All features are taken and their scores are calculated using the given formula.
• The mean of the scores is calculated.
• The score of each feature is compared with the mean value.
• The score measures how well the values of a feature discriminate between two different classes.

THRESHOLD FUZZY ENTROPY BASED FEATURE SELECTION

Features are selected based on Fuzzy C-Means clustering, following three frameworks:
• Mean selection
• Half selection
• Neural network

2.4 PCA

Principal Component Analysis (PCA) converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables. It reduces the dimension of the dataset with minimal loss of information and retains the most relevant features. The feature dimension is reduced by extracting the subset of features that best describes the data and yields high accuracy.

2.5 GDA

General Discriminant Analysis (GDA) applies a linear analysis model to the discriminant analysis problem and also to classification problems. Using GDA, a complex model can be built for the set of predictor variables.

FEATURE CLASSIFICATION

Classification follows once the features have been extracted using the feature selection techniques.
The best features are selected using classification algorithms such as Neural Network, Support Vector Machine (SVM), K-NN (K-Nearest Neighbor) and Decision Tree. Using these techniques, the best features are selected and grouped, and the datasets are classified with high accuracy in detecting disease. Classification draws on both supervised and unsupervised learning algorithms. Unsupervised techniques include clustering and Fuzzy C-Means; supervised learning algorithms include ANN, K-NN and Decision Tree.

2.6 NEURAL NETWORK

An Artificial Neural Network (ANN) is a powerful tool for solving complex problems by modelling the relationship between inputs and outputs. A neural network is modelled on the connections of the human brain and consists of N neurons. It is organized in three layers: an input layer, a hidden layer and an output layer. The hidden layer maps the input to the output. Neural networks have the following features: they support several network architectures for supervised and unsupervised learning, they use parallel computing for the training process, and they offer dynamic networks to store data. Unsupervised learning can be used to train on new inputs, with the network adjusting itself continuously. Mathematical rules, such as learning and training functions, adjust the weights and biases automatically.

Fig -1: Neural Network

2.7 SVM

SVM performs classification by finding the hyperplane that best separates the two classes. The training points that lie closest to the hyperplane are the support vectors. The optimal hyperplane is obtained by maximizing the width of the margin. With a linear hyperplane, the SVM solution is a unique global optimum. The resulting model assigns each new sample to one of the classes, making SVM a non-probabilistic linear classifier.
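The decision rule and margin of a linear SVM, as described in Section 2.7, can be illustrated with a short sketch. It does not train an SVM: the weight vector `w` and bias `b` are hand-chosen for a two-point toy problem in which (1, 1) belongs to class -1 and (3, 3) to class +1, and all function names are illustrative assumptions.

```python
import math


def decision_value(w, b, x):
    """Signed score w . x + b; its sign tells on which side of the hyperplane x lies."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Assign x to class +1 or -1 (a non-probabilistic decision)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Width of the margin between the two classes: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


if __name__ == "__main__":
    # Hand-chosen hyperplane: both toy points have |w . x + b| = 1,
    # so both act as support vectors for this hyperplane.
    w, b = [0.5, 0.5], -2.0
    for point in ([1.0, 1.0], [3.0, 3.0], [4.0, 4.0]):
        print(point, decision_value(w, b, point), predict(w, b, point))
    print("margin width:", margin_width(w))
```

In this toy case the margin width equals the distance between the two support vectors, which is what maximizing the margin aims for. A real application would obtain `w` and `b` from a training procedure rather than by hand.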
Fig -2: Support Vectors

2.8 K-NN

K-NN is a non-parametric method used for classification and regression. In both cases, the input consists of the K closest neighbours in the feature space. In classification, the output is a class membership: an object is assigned to the class most common among its K nearest neighbours. K is a positive integer, typically small.

2.9 PERFORMANCE ACCURACY

Diseases are diagnosed from medical datasets using several feature selection methods and classification techniques. Diagnostic accuracy varies from one technique to another. Combining different feature extraction techniques with a classification method provides better results. Table 2 shows the accuracy levels achieved by various combinations of feature selection and classification methods.

Table -2: Accuracy of Methods

3. CONCLUSION

Medical datasets can be classified by extracting different kinds of features from them. For feature extraction, Fuzzy C-Means clustering gives the highest accuracy. After pre-processing, the data has to be smoothed and optimized for the particular feature using techniques such as SVM and PSO. Applying PSO and a Fuzzy Cognitive Map to the optimized result then provides high classification accuracy. Such classification of medical datasets can be used to diagnose disease at an early stage.

ACKNOWLEDGEMENT

We earnestly thank our Supervisor for all his valuable comments and active support.
REFERENCES

[1] K. Polat and S. Güneş, "A new feature selection method on classification of medical datasets: Kernel F-score feature selection," Elsevier, 2009.
[2] P. Jaganathan and R. Kuppuchamy, "A threshold fuzzy entropy based feature selection for medical database classification," Computers in Biology and Medicine, vol. 43, no. 12, pp. 2222–2229, 2013.
[3] Peng Tao and Huang Yi, "A Method Based on Weighted F-score and SVM for Feature Selection," in CCDC, 2015.
[4] H. Temurtas, N. Yumusak, and F. Temurtas, "A comparative study on diabetes disease diagnosis using neural networks," Expert Systems with Applications, vol. 36, no. 4, pp. 8610–8615, 2009.
[5] K. Polat, S. Güneş, and A. Arslan, "A cascade learning system for classification of diabetes disease: generalized discriminant analysis and least square support vector machine," Expert Systems with Applications, vol. 34, no. 1, pp. 482–487, 2008.
[6] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.
[7] S. Salcedo-Sanz, A. Pastor-Sánchez, L. Prieto, A. Blanco-Aguilera, and R. García-Herrera, "Feature selection in wind speed prediction systems based on a hybrid coral reefs optimization - extreme learning machine approach," Energy Conversion and Management, vol. 87, pp. 10–18, 2014.

S. No   Selection Method    Classification Algorithm   Accuracy   Citation
1       Kernel F-Score      SVM                        76.03%     [1]
2       Fuzzy entropy       K-NN                       75.45%     [7]
3       F-Score             NN                         85.90%     [3]
4       Weighted F-Score    RBF                        79.12%     [8]
5       PCA                 SVM                        67.8%      [3,4]
6       GDA                 NN                         70.8%      [5]
[8] J. Kennedy and R. C. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948, IEEE, Perth, Australia, December 1995.
[9] K. Kayaer and T. Yıldırım, "Medical diagnosis on Pima Indian diabetes using general regression neural networks," in Artificial Neural Networks and Neural Information Processing (ICANN/ICONIP), pp. 181–184, Istanbul, Turkey, June 26–29, 2003.
[10] V. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[11] B. Ster and A. Dobnikar, "Neural networks in medical diagnosis: Comparison with other methods," in Proceedings of the International Conference on Engineering Applications of Neural Networks, pp. 427–430, 1996.
[12] D. West, P. Mangiameli, R. Rampal, and V. West, "Ensemble strategies for a medical diagnosis decision support system: A breast cancer diagnosis application," European Journal of Operational Research, vol. 162, pp. 532–551, 2005.

BIOGRAPHIES

P.M. Benson Mansingh completed his B.E. (Electronics and Communication Engineering) at St. Peter's Engineering College in 2010. He received his Master's degree in Network Engineering from Anna University Regional Campus, Coimbatore, in 2014. He is currently pursuing his Ph.D. degree at Anna University, Chennai, and working as Assistant Professor at Sri Ramakrishna Institute of Technology.

Dr. M. Yuvaraju is Assistant Professor in the Department of Electrical & Electronics Engineering, Anna University Regional Centre, Coimbatore. He has guided several UG and PG students and is currently guiding 5 Ph.D. research scholars at Anna University, Chennai.