Artificial Intelligence Supported Remote Health Information System Development

This study presents an AI-supported remote health information system aimed at improving patient access to healthcare services through accurate medical specialty recommendations based on patient complaints. Various classification algorithms, including Multi-Layer Perceptron (MLP), were evaluated, with MLP achieving the highest accuracy of 86.13%. The research highlights the potential of AI-based referral systems to enhance clinical decision support and address challenges in Turkey's healthcare system.

Konya Mühendislik Bilimleri Dergisi, c.**, s.**, 202*
Konya Journal of Engineering Sciences v.**, n.**, 202*
ISSN: 2667-8055 (Electronic)

ARTIFICIAL INTELLIGENCE SUPPORTED REMOTE HEALTH INFORMATION SYSTEM DEVELOPMENT

(Received: xx.xx.xxx; Accepted in Revised Form: xx.xx.xxxx)

ABSTRACT: The acceleration of digital transformation and the proliferation of telehealth applications
have become essential for the sustainability of healthcare systems and the improvement of service quality.
This study examines an advanced AI-based approach that aims to provide accurate medical specialty
recommendations based on patient complaints in healthcare settings. The model, developed to improve
patient guidance and accessibility to healthcare services, comparatively examines various classification
algorithms, including Logistic Regression, Multilayer Perceptron (MLP), Recurrent Neural Networks
(RNN), Random Forest (RF), Support Vector Machines (SVM), XGBoost, and LightGBM. Furthermore,
free-text patient complaints were analyzed using natural language processing (NLP) techniques and used
in the classification process. The model was tested with real patient data, and its suitability for integration into
healthcare systems was evaluated. Performance was evaluated using metrics such as
accuracy, precision, sensitivity, F-measure, and the ROC curve, with the MLP model achieving the highest
accuracy of 86.13%. The results demonstrate that the MLP classifier provides high accuracy in predicting
the correct specialty based on patient complaints. The study applies AI algorithms to real-world data to provide
accurate specialist recommendations based on free-text patient complaints. This study
demonstrates that AI-based referral systems can be used as an effective tool in clinical decision support
processes.

Keywords: Telehealth, Digital Health, AI in Health, Medical Specialty Prediction, Clinical Decision Support
Systems.

1. INTRODUCTION

Telehealth services have become an increasingly important solution in recent years for patient
satisfaction and the sustainability of the healthcare system. The COVID-19 pandemic, in particular, has
significantly increased demand for remote healthcare services worldwide, and digital health solutions
have played a critical role in facilitating individuals' access to healthcare services and reducing the burden
on healthcare systems [1]. While significant steps have been taken towards the digitalization of the
healthcare system in Turkey, according to 2023 TÜİK data, the number of physicians per 1,000 people, at
2.18, falls short of the OECD average of 3.7 [2-3]. While this rate is between 4.2 and 4.3 in countries like
Germany and Switzerland, the limited access to specialist physicians, particularly in rural areas, in Turkey
further increases the need for telehealth services.
Data from the Ministry of Health for 2022 reveals that approximately 27% of doctor appointments in
Turkey are canceled without being used, and 69% of citizens cite “waiting time” as one of their biggest
health concerns [4]. International studies have reported satisfaction rates with telehealth services as high
as 83% for patients and 74% for healthcare professionals [5]. In the US, patient satisfaction with telehealth
was 84.4% in general medical consultations and 94% in family medicine practices, while physician
satisfaction with these services ranged from 66% to 83% [6]. The prevalence of
telehealth use varies across countries; according to the CDC’s 2021 National Health Interview Survey
(NHIS), 37% of adults had used telemedicine (video and/or phone calls) at least once in the previous 12
months [7]. According to the subsequent 2022 NHIS report, this rate dropped to 30.1% [8].
Technological innovations and digitalization have fundamentally transformed the doctor-patient
relationship and the delivery of healthcare services, accelerating the transition to patient-centered
healthcare [9-11]. Online communication and health communities have emerged as new e-services that
enable remote and instantaneous interactions between doctors and patients [12-14]. Due to the impact of
the pandemic, telehealth applications have become widespread, and an increasing number of patients are
receiving contactless consultations through online platforms. E-health systems aim to reduce medical costs
while also improving the quality of healthcare and patient safety [15]. However, the large number of
medical specialties available for patients to choose from leads to information overload, which in turn
increases patient hesitancy [16].
Effective patient-doctor communication is known to be critical for improving health outcomes, as
misunderstandings during communication can lead to significant problems for both patients and
physicians [17-19]. However, the lack of interaction in patient portal applications and the accessibility of
appointment systems are fundamental challenges in the effective management of healthcare services. In
the healthcare sector, identifying appropriate specialties based on patient complaints, guiding patients,
and improving treatment processes are critical needs. To overcome issues such as patient information
overload and the unequal distribution of doctor visits, online health communities and medical consulting-
focused recommendations need to be developed [20].
Numerous studies have been conducted in the literature using medical social networks and large
datasets to facilitate patient access to healthcare and provide accurate specialty recommendations [21-24].
In light of these studies, it is crucial to conduct analyses that will improve patient access, referral, and
specialty recommendation processes.
This study utilizes various data analysis and artificial intelligence techniques to increase the
accessibility of healthcare services, facilitate patient referrals, and analyze healthcare data. The primary
objectives of the study include addressing the lack of interaction in patient portal applications, increasing
the accessibility of appointment services, reducing waiting times for physician appointments, improving
remote healthcare services, and ensuring the security of healthcare data. The AI-based models to be
developed are expected to improve patient access to and referral processes for healthcare services and
provide accurate specialist recommendations. Such technological solutions hold great potential for
making healthcare services more effective, accessible, and patient-centered.
While the use of AI and machine learning techniques in healthcare is increasing in the literature, the
development of models that make accurate specialist recommendations based on patient complaints
remains an important area of research. Existing studies generally focus on medical social networks and
big data analysis, but AI-based solutions that optimize healthcare access and referral processes in the
Turkish context are limited. This necessitates the development of models tailored to local needs. Machine
learning techniques, in particular, enable the development of models that can analyze patient complaint
data to make accurate specialty recommendations. This study, developed to improve patient guidance
and accessibility of healthcare services, comparatively examined various classification algorithms,
including Multi-Layer Perceptron (MLP), RNN, Random Forest (RF), Support Vector Machines (SVM),
XGBoost, and LightGBM. Patient complaint data is analyzed and specialist recommendation predictions
are made.
The unique aspect of this study is the development of an AI-powered model designed specifically for
Turkey's healthcare system dynamics and the practical solutions it offers to address interaction issues in
patient portal applications.
Consequently, this study provides a theoretical and practical contribution to increasing the
effectiveness of digital health applications both in Turkey and in countries with similar healthcare system
structures. It demonstrates that the development and integration of AI-based specialist recommendation
models is a significant step in improving the accessibility and quality of healthcare services.
The structure of the study is as follows: The first section includes an introduction that outlines the
purpose, importance, and rationale of the topic. The second section includes a literature review of previous
research on the topic. This is followed by the third section, which describes the study's dataset, system
structure, and methods. The fourth section presents the research findings and the conclusions drawn from
these findings. The final section concludes the study.

2. LITERATURE REVIEW

Today, the vast majority of data obtained in the healthcare field is in the form of unstructured text. In
this context, natural language processing (NLP) techniques play a critical role, particularly in tasks such
as patient complaint analysis. Proper data cleaning and preprocessing are crucial because they directly
impact model success. Jiang et al. [25] emphasized data preprocessing and quality control in healthcare
AI applications, stating that these steps form the basis of effective NLP applications. Vectorizing complaint
texts using the TF-IDF (Term Frequency-Inverse Document Frequency) method is widely used to make
data suitable for machine learning algorithms [26]. In the field of machine learning, algorithms such as
Decision Trees, Random Forests, Support Vector Machines (SVMs), Logistic Regression, Gradient
Boosting (GBMs), and LightGBMs have been effectively applied to classify health data and predict patient
risks [27-28]. Dimensionality reduction methods such as PCA (Principal Component Analysis),
particularly used on high-dimensional datasets, are preferred to reduce model complexity and optimize
computational costs [29]. PCA has been shown to improve model performance by eliminating unnecessary
or overlapping features in clinical data.
Deep learning models demonstrate superior performance, particularly on complex structures such as
clinical notes and patient histories containing time-series and sequential data. Recurrent neural networks
(RNNs) capture temporal dependencies in patient data, providing high accuracy in tasks such as
determining which outpatient clinic to refer [30-31]. Multi-Layer Perceptrons (MLPs), due to their
structure, are used as effective models for learning different features and in classification problems [32].
Wang et al. [33] integrated ensemble learning methods with TF-IDF for the analysis of patient complaint
texts and achieved significant success in predicting patient health risks. Similarly, [34] successfully
predicted patient hospitalization requirements using RNN-based models in sequential analysis of clinical
notes, demonstrating the power of deep learning in capturing complex patterns in healthcare data.
Consequently, the combined evaluation of both traditional machine learning and deep learning models is
crucial for improving data-driven decision-making in healthcare. While stages such as data cleaning and
feature engineering directly impact model performance, the combination of different algorithms and
dimensionality reduction methods contributes to the creation of more robust and generalizable models.
Marchiori et al. [35] described an AI-based triage system developed using approximately 1
million teleconsultation records. The system asks personalized questions based on patient symptoms and,
via a mobile application, recommends the appropriate point of care and timing. This
method is closely aligned with the outpatient clinic recommendations that the MLP model in this study
makes based on patient complaints.
Mahyoub [36] demonstrated the optimization of the healthcare referral process in a simulation
environment using a machine learning-based model (Random Forest). It was noted that the model
reduced the delay times in the "post-discharge referral" process. This offers a similar approach for
evaluating healthcare referral processes. Gligorijević et al. [37] analyzed both structured data and physician
notes from emergency department patients with an RNN and an attention-based model,
achieving 16% higher accuracy than human performance in source and referral predictions. This
demonstrates the value of RNNs in analyzing complaint data. Girardi et al. [38]
reported 79–85% accuracy for identified symptoms with an attention-based CNN that they developed
for triaging 600,000 medical notes. This study provides a robust example of the power of deep
learning methods in complaint and medical note analysis. The study in [5] examined satisfaction levels with
telehealth services during the COVID-19 pandemic through a systematic review and meta-analysis. According to the
analysis results, patients reported 83% satisfaction with telehealth services, while healthcare professionals
reported 74%. These findings demonstrate a generally high level of acceptance of telehealth services for
both the recipient and provider.

3. MATERIAL AND METHODS

In this section, the outline of the study, model stages, data set, data preprocessing and descriptive data
analysis are explained.

3.1. Proposed Model

An overview of the proposed system is illustrated in Figure 1. The system consists of the following stages:
• Data Cleaning and Preprocessing: Specialized methods were used to remove unnecessary
information from the dataset and fill in missing values. Additionally, preparations were made
for complaint analysis using natural language processing (NLP) techniques on text data.
• Data Vectorization: Complaint data was vectorized using the TF-IDF method, thus
transforming it into numerical data suitable for machine learning and deep learning models.
• Machine Learning Models: Various machine learning models (Decision Tree, Random Forest,
SVM, Logistic Regression, GBM, Light GBM) were used in the study, and their performance
was evaluated. Furthermore, the effects of data reduction methods such as PCA were
examined on the models.
• Deep Learning Models: Deep learning models such as RNN (Recurrent Neural Network) and
MLP (Multi-Layer Perceptron) were used, and success metrics were evaluated to predict
which outpatient clinic patients should visit.

Figure 1. The proposed research approach.
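
As a rough illustration of the data vectorization stage listed above, the following sketch computes TF-IDF weights with the Python standard library only. It assumes one common variant (term count divided by document length, multiplied by the natural logarithm of the number of documents over the document frequency); the exact variant used in the study may differ. The function name `tf_idf` and the sample complaints are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already-stemmed complaints (illustrative only).
complaints = [
    ["chest", "pain", "breath"],
    ["head", "pain"],
    ["skin", "rash", "itch"],
]
vectors = tf_idf(complaints)
```

In this toy input, "pain" occurs in two of the three complaints and therefore receives a lower weight in the first complaint than "chest", which occurs in only one.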


3.2. Dataset

The dataset used in this study has dimensions (483570, 2314), that is, 483,570 records and 2,314 variables.
The dataset variables used in the study are:
• Date of birth,
• Gender,
• Protocol department name,
• Patient complaint
The dataset was obtained from the medical records
of patients admitted to the internal medicine clinics of Medipol Mega University Hospital and Istanbul
Memorial Hospital from January 2023 to December 2024.
Data was collected from the hospital information management system (HBYS) database and transferred
in anonymized form. To protect patient rights and privacy, patient health information and personal/identity
information were retrieved from the database in anonymized form, and no
information other than that necessary for the purpose of the study was retrieved.

3.3. Random Forest (RF)

RF is an ensemble learning method created by combining many decision trees and was developed to
provide high accuracy and stability in classification problems. It has low sensitivity to hyperparameter
settings and generally yields successful results without the need for extensive prior optimization [39]. The
RF model, first described by Breiman [40], is widely used in classification and regression problems. Instead
of relying on the decisions of a single decision tree, this model aims to minimize classification error by
combining multiple decision trees generated by randomly sampling the training dataset. The random
forest approach is based on the principle of randomly selecting both the training data and the independent
variables to be used in the splitting operations. In the model, the dataset is first divided into training and
test sets; approximately two-thirds of the training data is then used to construct decision trees, while the
remaining data is reserved for evaluating the model's accuracy. Decision trees are constructed using
bootstrap sampling and the CART algorithm; furthermore, the trees are not pruned [41].
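
The "approximately two-thirds" figure follows from bootstrap sampling: drawing n rows with replacement from n rows leaves each tree with about 63% of the distinct rows, and the remaining rows (the out-of-bag rows) can be used to estimate accuracy. The sketch below illustrates this with the standard library; the function name `bootstrap_split`, the sample size, and the seed are arbitrary choices for illustration, not details of the study.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample of row indices; return (in_bag, out_of_bag)."""
    # Sampling WITH replacement: some rows repeat, others are never drawn.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
in_bag, out_of_bag = bootstrap_split(10_000, rng)

# Share of distinct rows seen by one tree; close to 1 - 1/e (about 0.632).
unique_share = len(set(in_bag)) / 10_000
```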

3.4. Gradient Boosted Tree (GBT)

Gradient Boosted Tree is essentially an optimized and accelerated version of the gradient boosting
algorithm. Popular implementations such as XGBoost include many optimizations such as parallel
computation and memory efficiency, providing high performance and accuracy, especially on large
datasets [42]. The GBT algorithm can operate directly on the raw data without requiring feature selection
or transformation. However, since all features and examples must be examined to determine the best
splitting points, significant computational and memory resource consumption can occur [43]. The gradient
boosting decision tree (GBDT), developed by Friedman, is the algorithm that forms the basis of XGBoost
[44].

3.5. Support Vector Machine (SVM)

SVM is a powerful machine learning method developed by Vapnik and Chervonenkis within the
framework of statistical learning theory [45]. Its most important feature in classification problems is that
it solves the problem by transforming it into an optimization problem. Furthermore, its ability to work
effectively with irregular and high-dimensional data sets is among the advantages of SVM [46]. For the
classification of complex data structures, appropriate kernel functions map the data into higher-dimensional
spaces in which they become linearly separable. In this transformed space, the
hyperplane that provides the best separation is determined [47]. Thus, classification is performed by determining the
decision boundaries that maximize the margin between classes. SVM can demonstrate high performance
on both linearly separable and non-separable data sets [48].

3.6. Logistic Regression

Logistic regression is a statistical method used specifically for classifying categorical dependent
variables. Unlike linear regression, it uses maximum likelihood estimation to model the relationship
between the dependent variable and independent variables. In cases where the dependent variable is not
continuous, LR, preferred as an alternative to linear regression, can be applied in different types
depending on the structure of the dependent variable: binary logistic regression, nominal logistic
regression, and ordinal logistic regression. This method performs classification by separating data
according to the categories of the dependent variable [49].

3.7. Decision Tree

Decision trees are one of the most widely used classification methods in data mining. Using an
inductive method, a tree-structured diagram is created to solve the problem using input and output data.
A decision tree consists of a root, branches, leaves, and the decision nodes connecting them. The first node
at the top of the tree that does not receive input is called the root node. Nodes following the root node
may have multiple internal nodes. Nodes at the end of the tree that do not branch are called leaf
(consequence) nodes. A decision tree starts from the root node and progresses downward to the leaves.
The branches of the constructed tree represent candidate paths for the completion of the classification
process. If the classification process does not occur at the end of a branch, a new decision node is created
at that branch; however, if a class is determined, it is recorded as a leaf node at the end of that branch.
Leaves represent one of the classes determined by the data [50].

3.8. LightGBM

The LightGBM model, developed by Microsoft, is optimized for working with large datasets and high-
dimensional data. LightGBM uses a histogram-based algorithm that provides advantages in terms of
speed and memory usage, and adopts a leaf-wise growth strategy to improve accuracy [51]. This approach
divides continuous data into G bins by keeping histograms of a certain width, which is efficient in terms
of memory and training time. The LightGBM model can be optimized by changing the tree growth
method; it prefers leaf-wise growth instead of the level-wise growth used by most tree-based learning
algorithms. Leaf-wise growth selects for splitting the leaf that provides the greatest information gain, that is, the
largest reduction in loss. The goal of the model is to determine an approximation function that minimizes
the loss function, expressed as:

$$\hat{R} = \arg\min_{R} \sum_{i=1}^{N} L\big(y_i, R(x_i)\big) \tag{1}$$

Here, $N$ is the number of data points, $x_i$ is a specific data point, $y_i$ is the label of data point $x_i$, and
$R(x_i)$ is the estimate for $x_i$.

3.9. Deep Learning Methods

Artificial neural network models used in deep learning are designed with inspiration from the
functional structure of the human brain. Thanks to their learnable structures, they possess capabilities
such as perception, control, analysis, data storage, and inference from this data. These models perform a
mathematical learning process by creating their own features. While various artificial neural network
models exist depending on the application area, single-layer, multi-layer, feedforward, and feedback
artificial neural network models are widely used [52].

3.9.1. Multilayer Perceptron (MLP)

MLP, one of the popular artificial neural network models, consists of an input layer, one or more
hidden layers, and an output layer [53-55]. Each layer consists of perceptron units called neurons;
each connection has a certain weight, and each neuron has a bias. The input layer has $n$ input variables
$X = \{x_1, x_2, \ldots, x_n\}$, and the output layer has $m$ output variables $Y = \{y_1, y_2, \ldots, y_m\}$.

The total number of parameters of an MLP can be found using [56]:

$$n \cdot h_1 + \sum_{k=1}^{N_h - 1} h_k \cdot h_{k+1} + h_{N_h} \cdot m \tag{2}$$

where $h_k$ is the number of hidden nodes in the $k$th hidden layer and $N_h$ is the number of hidden layers.
As $N_h$ and $h_k$ increase, longer computation
periods are needed to optimize an MLP.
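
A minimal sketch of Equation (2) follows, assuming a fully connected network in which the last term multiplies the final hidden layer by the number of output variables, and counting connection weights only (biases are not included in the equation). The function name `mlp_weight_count` and the layer sizes are hypothetical.

```python
def mlp_weight_count(n_inputs, hidden_sizes, n_outputs):
    """Count the connection weights of a fully connected MLP (biases excluded)."""
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    # Each pair of consecutive layers contributes (size_a * size_b) weights.
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


# 4 inputs, hidden layers of 3 and 2 nodes, 5 outputs:
# 4*3 + 3*2 + 2*5 = 28
example = mlp_weight_count(4, [3, 2], 5)
```

Because every term is a product of two layer sizes, widening a layer or adding a layer increases the count quickly, which is consistent with the longer optimization times noted above.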

3.9.2. Recurrent Neural Networks (RNN)

Recurrent Neural Networks (RNNs), developed by Jeffrey Elman, are learning and prediction-based
artificial neural network architectures that form a cyclical structure by feeding the outputs from previous
time steps back in as the next input, thus enabling the modeling of relationships within sequential
data [57]. Thanks to this structural feature, information can be retained within the network and
used when needed.
Unlike feedforward neural networks, RNNs can process the order and dependencies of input data over
time using their own internal memory. This allows for efficient modeling of relationships between
consecutive data. While each input sample is processed independently in traditional neural networks, in
RNN models, the output at each time step depends on the computations in previous time steps.
Figure 2 illustrates the working principle of the RNN algorithm. Here, at any time t, h represents the
activation, x represents the input, and y represents the output. The previous output h’ is also the next
input.

Figure 2. RNN algorithm
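
The recurrence described above can be sketched with a single scalar unit. This is a simplified illustration under stated assumptions, not the network used in the study: the weights `w_xh`, `w_hh`, and `w_hy` are hypothetical scalars, the activation is tanh, and biases are omitted.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, w_hy, h0=0.0):
    """Run a scalar Elman-style recurrent unit over a sequence."""
    h = h0
    outputs = []
    for x in inputs:
        # The new activation depends on the current input x
        # and on the previous activation h.
        h = math.tanh(w_xh * x + w_hh * h)
        outputs.append(w_hy * h)
    return outputs, h


# With a recurrent weight, the first input still influences the second output.
with_memory, _ = rnn_forward([1.0, 0.0], w_xh=1.0, w_hh=0.5, w_hy=1.0)
# With w_hh = 0 the unit behaves like a feedforward unit and forgets it.
without_memory, _ = rnn_forward([1.0, 0.0], w_xh=1.0, w_hh=0.0, w_hy=1.0)
```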

3.10. Model Performance Comparison

Metrics such as accuracy, recall, specificity, precision, F-score, kappa coefficient, and ROC (Receiver
Operating Characteristic) curve are frequently used to evaluate the performance of classification models.
These metrics, with the exception of the kappa coefficient and the ROC curve, are calculated from a
confusion matrix, a table that cross-tabulates the true classes against the classes predicted by the model
and enables detailed analysis of the classification results [58]. Commonly used
metrics for evaluating model performance in classification methods and their calculation formulas are
presented in Table 1.
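Table 1. Performance criteria and formulas of classification methods

Metric        Formula
Sensitivity   TP / (TP + FN)
Specificity   TN / (TN + FP)
Accuracy      (TP + TN) / (TP + FN + TN + FP)
F-Score       (2 × Precision × Sensitivity) / (Precision + Sensitivity), where Precision = TP / (TP + FP)

TP: true positives; TN: true negatives; FP: false positives; FN: false negatives.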
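
The metrics in Table 1 can be computed directly from the four confusion-matrix counts, as in the sketch below. It uses the standard definitions, with the F-score taken as the harmonic mean of precision and sensitivity. The function name `binary_metrics` and the counts are hypothetical; for a multi-class problem such as specialty prediction, the same calculation would typically be applied to each class in a one-vs-rest fashion, which is an assumption about usage rather than a description of the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# Hypothetical counts for one class (illustrative only).
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```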

4. RESULTS AND DISCUSSION

The analysis for this study was conducted in the Python programming language using open-source
libraries for data science and machine learning. NumPy and pandas were used for data handling, matplotlib and
seaborn for data visualization, and scikit-learn, TensorFlow, Keras, and PyTorch for
data analysis and modeling.

4.1. Building a Stopwords Vocabulary

A vocabulary of approximately 1,800 words specific to the healthcare field was created to automatically
remove words that carry little meaning from the dataset. It is intended to help reduce the size and complexity of texts. The
vocabulary is designed to be updatable, as a regularly updated and enriched vocabulary will yield more accurate and effective
results in the analysis and interpretation of healthcare texts.

4.2. Building the Zemberek Library

Zemberek is a Java-based library for Turkish natural language processing (NLP). Zemberek provides
comprehensive support for Turkish grammar, tokenization, and sentence extraction. The JPype module
was installed to call Zemberek's Java API from Python. A Turkish grammar model, a Turkish tokenization model,
and a Turkish sentence extraction model were created using each model's default settings.

Figure 2. Samples from dataset

4.3. Data Preprocessing

The variables used in the study are as follows:

• Date of Birth
• Gender
• Protocol Department Name
• Patient Complaint

Data preprocessing began with grouping the departments where patients were treated in the hospital.
Some values of this variable were combined with similar departments. For example, the
departments "İÇ HASTALIKLARI YOĞUN BAKIM" (internal medicine intensive care) and "İÇ HASTALIKLARI"
(internal medicine) were combined as "İÇ HASTALIKLARI". Finally, departments with fewer than 100
observations or created by prompting
were deleted from the data set. These departments were deleted because they did not contain sufficient
data for analysis or could lead to statistically unreliable results. Deleting departments with few
observations made the data set more consistent and reliable.
In this study, the K-Nearest Neighbor (KNN) method [59] was used to fill in missing (null) values in the
large data set. As a first step, the data set was reduced by randomly sampling 500 data points. Next, the
features to be used to fill the missing values were determined. Cross-validation was used to select the
most appropriate K value.
Missing values were then estimated using the KNN algorithm: for each missing value, the K nearest
neighbors were found and used to estimate it, and the results were recorded. Upon
completion, RMSE (Root Mean Square Error) was used to assess the agreement of the estimated values
with the actual values. This method was successfully applied to fill the missing values, and the estimates
were found to be consistent with the dataset.
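
The imputation procedure described above can be sketched as follows. This is a minimal standard-library illustration, assuming numeric features, Euclidean distance on the observed columns, the mean of the K nearest complete rows as the estimate, and at least one complete row; the study's actual implementation, features, and K value are not specified here. The names `knn_impute` and `rmse` and the sample rows are hypothetical.

```python
import math


def knn_impute(rows, k):
    """Fill None entries with the column mean of the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(other):
            # Euclidean distance measured on the observed columns only.
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=distance)[:k]
        filled.append([
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ])
    return filled


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical numeric rows; the last row has a missing second feature.
rows = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [1.5, None]]
filled = knn_impute(rows, k=2)
```

To assess the estimates with RMSE, known values can be masked, imputed, and then compared with their true values, for example `rmse([15.0], [filled[3][1]])`.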

[Figure: bar chart showing the number of records for each of 35 outpatient clinics.]

Figure 3. Outpatient clinic distribution

4.4. Data Preprocessing by NLP (Natural Language Processing)

The Complaint variable in the dataset consists of free text entered manually by doctors. The dataset
was cleaned using Turkish text mining. The text mining operations were as follows:
• All uppercase letters in the text string in the Complaint variable were converted to lowercase.
• All numerical expressions were deleted.
• Unnecessary words were deleted using the Stopwords vocabulary.
• Single-letter words were deleted from the data in the Complaint variable.
• Word frequency was analyzed to identify misspelled words. The misspelled words were transferred
to the "Levenshtein Distance" module.
• Spelling errors were corrected. Rarely used words with low frequency were deleted through the
"Rare Words" module.
The TurkishTokenizer class is used to segment Turkish text into sentences and words. The
TurkishMorphology class is used to analyze the structure of Turkish words (plural, possessive, verb
conjugation, etc.). The TurkishSentenceExtractor class is used to split Turkish text into sentences. After
data cleaning, the first preprocessing step, the Complaint variable was parsed using the "Tokenization"
method. This was designed to create an infrastructure for identifying word roots. In Turkish NLP,
tokenization is the process of breaking a text into smaller pieces, or tokens. Tokens can consist of words,
numbers, punctuation marks, and other symbols. After unwanted symbols are removed, the text is tokenized.
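
The first cleaning operations listed above (lowercasing, removing numbers, stopwords, and single-letter words) can be sketched with the standard library as shown below. The stopword set is a three-word hypothetical stand-in for the study's vocabulary of about 1,800 words, the sample complaint is invented, and whitespace splitting stands in for Zemberek's tokenizer. One detail worth noting is that Python's default `lower()` does not follow Turkish casing rules for "I" and "İ", so the sketch maps them explicitly.

```python
import re

# Hypothetical stand-in for the study's healthcare stopword vocabulary.
STOPWORDS = {"ve", "ile", "var"}


def turkish_lower(text):
    """Lowercase with Turkish rules: "I" becomes "ı" and "İ" becomes "i"."""
    return text.replace("I", "ı").replace("İ", "i").lower()


def clean_complaint(text):
    """Apply the basic cleaning operations and return the remaining tokens."""
    text = turkish_lower(text)
    text = re.sub(r"\d+", " ", text)       # delete numerical expressions
    text = re.sub(r"[^\w\s]", " ", text)   # delete punctuation and symbols
    tokens = text.split()                  # simple whitespace tokenization
    # Delete single-letter words and stopwords.
    return [t for t in tokens if len(t) > 1 and t not in STOPWORDS]


tokens = clean_complaint("3 GÜNDÜR BAŞ AĞRISI VE ATEŞ var.")
# tokens == ["gündür", "baş", "ağrısı", "ateş"]
```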

Figure 4. Word cloud of complaints before data pre-processing

Stemming, the process of reducing a word to its root, was performed on all tokens. Stemming
was performed in the Zemberek library using the "Lemmatize" function. This function converts a word to
its root and also returns the possible meanings of the root. Words that cannot be stemmed are labeled
"UNK" and recorded in a separate database to determine if they are misspelled. The "Levenshtein
Distance" method was used to identify misspelled words within the labeled data.

Figure 5. General status of the data set

• Levenshtein Distance Calculation Function
Levenshtein distance is a metric that measures the difference between two texts as the minimum number
of single-character insertions, deletions, and substitutions needed to turn one into the other.
• Corrected Text Selection
For each Complaint text, the Levenshtein distance to all other texts in the dataset is calculated, and the
text with the lowest distance is selected. This step determines the correct words used to replace
misspelled words.
• Applying Corrected Texts
The selected text with the lowest Levenshtein distance was used to replace the misspelled word in the
Complaint text. Thus, the misspelled words were automatically corrected.
• Evaluating the Results
Once the correction process was completed, the corrected texts were compared with the original texts
and the accuracy of the correction was evaluated.
Spelling correction using Levenshtein distance proved an effective way to make the text-based data more
consistent and meaningful.
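
The correction steps above can be sketched as follows. This is a minimal illustration, not the study's module: the function names, the small vocabulary, and the choice to compare a word against a list of known words are assumptions.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def correct(word, vocabulary):
    """Return the vocabulary entry with the lowest distance to word."""
    return min(vocabulary, key=lambda candidate: levenshtein(word, candidate))


# Hypothetical vocabulary of correctly spelled words.
vocabulary = ["ağrı", "ateş", "öksürük"]
suggestion = correct("öksürk", vocabulary)
```

Here the misspelled "öksürk" is one insertion away from "öksürük", so that entry is selected as the replacement.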
After the spelling errors were corrected, the stemming process was repeated. Reducing some words to
their roots introduces errors, because the root no longer carries the negation. To prevent this, words
with a negative meaning are left as they are during the stemming process. For example, stemming a
negated form of the word "digestion" would return simply "digestion." If a negative suffix is present at
the end of a word, the stemming process is therefore not applied. This prevents incorrect stemming.

4.5. Data Vectorization

The TF-IDF (Term Frequency-Inverse Document Frequency) method was used to represent patient
complaints as vectors. The text data in the "Root_Complaint" variable was processed with this method,
which weights each word by its importance in a document relative to the whole collection. The steps
were as follows:
• Preparing the Data
First, the text data in the "Root_Complaint" variable was cleaned, and the necessary preprocessing
steps were repeated.
• Calculating Term Frequency (TF)
Next, the frequency of each word within each complaint was calculated by dividing the text data into
words and counting the number of occurrences of each word.
• Calculating Document Frequency (DF)
Then, the number of different complaints in which each word occurs was calculated.
• Calculating TF-IDF Values
Finally, the TF-IDF value of each word was calculated using the following formula.

TF-IDF = TF * log (N / DF) (3)

Here, TF is the frequency of the word in a given complaint, DF is the number of complaints that contain
the word, and N is the total number of complaints.
• Data Vectorization
The data was then vectorized by combining the TF-IDF values of each complaint into a vector, so that
the TF-IDF value of each word becomes a new variable for that complaint.
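
Formula (3) can be worked through on a toy corpus. The sketch below assumes that TF is the raw count of a word in a complaint, that DF is the number of complaints containing the word, and that the logarithm is natural; the source specifies none of these details. The function name and the sample tokens are illustrative.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Compute TF * log(N / DF) for every word of every tokenized document."""
    n_docs = len(documents)
    # DF: number of documents in which each word appears at least once.
    document_frequency = Counter()
    for tokens in documents:
        document_frequency.update(set(tokens))
    vocabulary = sorted(document_frequency)
    vectors = []
    for tokens in documents:
        term_frequency = Counter(tokens)  # TF: raw count (assumption)
        vectors.append([
            term_frequency[word] * math.log(n_docs / document_frequency[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical stemmed complaints.
documents = [["head", "pain"], ["stomach", "pain"], ["head", "dizzy"]]
vocabulary, vectors = tf_idf_vectors(documents)
```

Each complaint becomes one row with one column per vocabulary word. A word that occurs in every complaint gets log(N / DF) = log(1) = 0, so it carries no weight.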

4.6. Data Visualization

The word clouds generated from the data set after data preprocessing, based on the Protocol Section
name, are given below.

Figure 6. Word cloud of complaints after data pre-processing

Figure 7. Age distribution by outpatient clinic

Figure 8. Gender distribution by polyclinic

Figure 9. Age distribution based on polyclinic and gender

4.7. Application of Machine Learning and Deep Learning Models

4.7.1. The Effect of Data Reduction with PCA on Machine Learning Models

This section examines how PCA (Principal Component Analysis), one of the data reduction methods,
affects the performance of the machine learning models. PCA is a technique used to reduce the
dimensionality of multivariate datasets.
The dataset has dimensions (483570, 2314), that is, 483,570 records and 2,314 variables. Using the PCA
method, these variables were reduced to 1100 principal components, which preserved 94.3% of the total
variance. After applying PCA, predictions were performed using machine learning models (Decision
Tree, Random Forest, SVM, Logistic Regression, GBM, Light GBM). The same models were also trained
and tested on the same dataset without PCA.
According to the results, data reduction using the PCA method negatively impacted the performance
of the machine learning models. The models trained without PCA achieved higher accuracy and
performance metrics than the models trained after PCA. These results suggest that the information lost
through PCA mattered for classification, and that the models performed better on the original data,
whose structure was fully preserved.

Figure 10. PCA cumulative variance percentage

These analysis results indicate that PCA is not suitable for every dataset and may negatively impact
performance. Before performing data reduction, the structure of the dataset and the purpose of the
analysis should be considered.
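
The relationship between the number of principal components and the preserved variance can be sketched with NumPy. This is an illustrative computation on synthetic data, not the study's code: the function name, the random data, and the use of a singular value decomposition are assumptions; only the 94.3% target comes from the text.

```python
import numpy as np


def components_for_variance(X, target=0.943):
    """Return the smallest number of principal components whose cumulative
    explained-variance ratio reaches target, plus the cumulative curve."""
    centered = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional to the
    # variance explained by each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, target) + 1)
    return k, cumulative


# Synthetic data: five variables with very different scales (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])
k, cumulative = components_for_variance(X, target=0.943)
```

Plotting `cumulative` against the component index gives a curve of the kind a cumulative variance figure shows.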
4.7.2. Application of Machine Learning Models

4.7.2.1. Exploratory Data Analysis

During the exploratory data analysis phase, a module was developed for exploratory analysis of the
data set. Categorical, numerical, and cardinal variables were analyzed based on the target variable. All
steps, including the separation of variable types, the examination of categorical variables, and the
examination of the target variable, were performed as modules.

4.7.2.2. Data Preprocessing and Feature Engineering

During the data preprocessing and feature engineering phases, the default quantiles for calculating the
lower and upper "outlier thresholds" were set to q1 = 0.25 and q3 = 0.75. Variables containing outliers
were checked, and outlying values were replaced with the corresponding threshold value. A "one-hot
encoder" function was used for newly created categorical variables. So that the same function could also
act as a "label encoder" for two-class variables, the "drop first" argument was set to "True." The
two-class variable "Gender" was encoded using the "one-hot encoder" function.
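
The outlier handling can be sketched as below. The text gives only the quartiles (q1 = 0.25, q3 = 0.75); the 1.5 × IQR rule used here to turn them into thresholds is a common convention and an assumption, as are the function names and the sample values.

```python
import statistics


def outlier_thresholds(values, iqr_factor=1.5):
    """Lower and upper thresholds from the first and third quartiles.

    iqr_factor=1.5 is an assumed convention, not stated in the source.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - iqr_factor * iqr, q3 + iqr_factor * iqr


def clip_to_thresholds(values, iqr_factor=1.5):
    """Replace values outside the thresholds with the nearest threshold."""
    low, high = outlier_thresholds(values, iqr_factor)
    return [min(max(value, low), high) for value in values]


# Hypothetical numeric variable with one extreme value.
ages = [10, 12, 11, 13, 12, 95]
thresholds = outlier_thresholds(ages)
clipped = clip_to_thresholds(ages)
```

For this sample the quartiles are 11.25 and 12.75, so the thresholds are 9.0 and 15.0, and the value 95 is replaced with 15.0.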

4.7.2.3. Base Models

For the base models, the evaluation was first set up using the hold-out method, with 70% of the data
used as the training dataset and 30% as the test dataset, and the models were fitted. Decision Tree,
Random Forest, SVM, Logistic Regression, GBM, and Light GBM were used, that is, tree-based
algorithms alongside SVM and Logistic Regression.
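
The 70/30 hold-out split can be sketched in a few lines. The function name, the fixed seed, and the plain random shuffle (without stratification by outpatient clinic) are assumptions; the source states only the split ratio.

```python
import random


def hold_out_split(rows, test_ratio=0.30, seed=42):
    """Shuffle rows and split them into a training part and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical record identifiers.
records = list(range(10))
train_rows, test_rows = hold_out_split(records)
```

Every record ends up in exactly one of the two parts, so the test score is computed only on data the model has not seen during fitting.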

Figure 11. Performance of all models

4.7.3. Application of Deep Learning Models

4.7.3.1. Application of RNN Model

In the healthcare sector, accurately determining which outpatient clinic patients should visit can help
them access healthcare quickly and effectively. In this context, a model was developed that
recommends which outpatient clinic patients should visit based on their complaints.
The first deep learning model applied was an RNN (Recurrent Neural Network) with an LSTM
architecture. This model evaluated patient complaints and predicted the outpatient clinic with an
accuracy of 82%. This result did not reach the desired performance, so an MLP model was evaluated
next.

4.7.3.2. Application of MLP Model

This section describes the development of a prediction model based on MLPClassifier (Multilayer
Perceptron Classifier). It covers the technical aspects of the methodology, the neural network
architecture, the extracted features, and the evaluation metrics used to compare the performance of the
MLPClassifier with that of the RNN-based model. The aim is to show what neural networks can offer
healthcare recommendation systems.

Figure 12. Confusion matrix of multilayer perceptron model

In the healthcare industry, determining the appropriate medical specialty based on patient complaints
is a crucial challenge for improving patient guidance and treatment processes. Artificial intelligence and
machine learning techniques enable the development of models that can make accurate specialty
recommendations to patients. The performance of the MLP (Multi-Layer Perceptron) classifier in
predicting specialty recommendations from patient complaints is evaluated below using standard
success metrics.

Figure 13. Performance evaluations of the multilayer perceptron model

• Accuracy: The proportion of all complaints for which the model predicted the correct branch. The
model predicted the correct branch 86.13 percent of the time.
• Precision: The proportion of the model's suggestions that were correct. 86.14 percent of the branches
the model suggested were actually correct.
• Recall: The proportion of the true branches that the model identified. The model predicted 86.13
percent of all true correct branches.
• F1 Score: The harmonic mean of the Precision and Recall metrics, which indicates the balance between
suggesting the correct branch and recalling the true correct branches. Its value is 84.77 percent.
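
For a multi-class problem these metrics are computed per branch and then averaged. The sketch below assumes support-weighted averaging, which the source does not state. Two observations are consistent with that assumption: weighted recall always equals accuracy, as in the reported 86.13 percent, and an F1 score below both precision and recall can only arise when per-class F1 values are averaged rather than taking the harmonic mean of the averaged precision and recall. The function name and the labels are illustrative.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)    # true samples per class
    predicted = Counter(y_pred)  # predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = {"accuracy": sum(hits.values()) / n,
              "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for label, count in support.items():
        p = hits[label] / predicted[label] if predicted[label] else 0.0
        r = hits[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n  # class share of the true labels
        scores["precision"] += weight * p
        scores["recall"] += weight * r
        scores["f1"] += weight * f
    return scores


# Hypothetical true and predicted branches.
y_true = ["cardiology", "cardiology", "dermatology",
          "neurology", "neurology", "neurology"]
y_pred = ["cardiology", "dermatology", "dermatology",
          "neurology", "neurology", "cardiology"]
scores = weighted_scores(y_true, y_pred)
```

In this toy example four of six predictions are correct, so accuracy and weighted recall are both 4/6, while weighted precision is 0.75.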

Figure 14. ROC curve of multilayer perceptron model

The results show that the developed MLP classifier model is highly successful in making accurate
specialist recommendations based on patient complaints. This model can play an important role in
healthcare settings, for example in patient guidance and treatment improvement.

4. CONCLUSIONS

This study comprehensively examined an artificial intelligence-powered classification system that
aims to provide accurate medical specialty recommendations based on patient complaints. Comparative
analyses of different machine learning algorithms (MLP, RF, SVM, XGBoost, LightGBM) were conducted,
and the practical applicability of the models was evaluated by testing them on real patient data.
Furthermore, meaningful features were extracted from free-text data using natural language processing
(NLP) techniques, strengthening the classification process.
The most striking result of the study was that the MLP model outperformed other models with an
accuracy rate of 86.13%. This finding demonstrates that MLP can be used as an effective decision support
tool in patient navigation systems. Furthermore, the study specifically addressed ethical, data privacy,
and security issues, and evaluated how the developed systems should be positioned within a sustainable
and secure healthcare infrastructure.
In this context, the research contributes both to the theoretical literature and provides a viable model
for practical healthcare solutions. Future studies can progress towards generalizing the model to different
patient profiles and healthcare systems, integrating multilingual text analyses, and improving the system
through user experience-focused evaluations.

Declaration of Ethical Standards

This study was approved by the Istanbul Medipol University Non-Interventional Clinical Research
Ethics Committee (approval date: 26.12.2024, approval number: 1363).

Credit Authorship Contribution Statement

K.Ş.: Researched and supplied experimental materials, wrote the article, and reviewed the manuscript.
AK and Ö.A.: Conducted the experiments, wrote the article, and reviewed the manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships
that could have appeared to influence the work reported in this paper.

Funding / Acknowledgements

The authors have not disclosed any funding.

Data Availability

Data will be made available on request.

5. REFERENCES

[1] World Health Organization, “Global strategy on digital health 2020–2025: Monitoring and
evaluation framework,” 2022. [Online]. Available:
https://www.who.int/publications/i/item/9789240065550
[2] Türkiye İstatistik Kurumu (TÜİK), “Sağlık İstatistikleri Yıllığı 2023,” 2023. [Online]. Available:
https://data.tuik.gov.tr
[3] World Bank, “Physicians (per 1,000 people),” 2021. [Online]. Available:
https://data.worldbank.org/indicator/SH.MED.PHYS.ZS
[4] T.C. Sağlık Bakanlığı, “2022 Sağlık İstatistikleri Yıllığı,” Sağlık Bilgi Sistemleri Genel Müdürlüğü,
2022. [Online]. Available: https://www.saglik.gov.tr
[5] L. Fadaizadeh, F. Velayati, and M. Arab-Zozani, “Satisfaction of patients and physicians with
telehealth services during the COVID-19 pandemic: A systematic review and meta-analysis,”
Healthcare Informatics Research, vol. 30, no. 3, pp. 206–223, 2024.
[6] A. C. Smith et al., “Telehealth for global emergencies: Implications for coronavirus disease 2019
(COVID-19),” J. Telemed. Telecare, vol. 26, pp. 309–313, 2020, doi: 10.1177/1357633X20916567.
[7] M. A. Villarroel and L. Dai, “Telemedicine use among adults: United States, 2021,” NCHS Data
Brief, no. 445, 2023. [Online]. Available: https://www.cdc.gov/nchs/products/databriefs/db445.htm
[8] J. W. Lucas and M. Villegas, “Telemedicine use among adults: United States, 2022,” Natl. Health
Stat. Rep., no. 205, 2024.
[9] V. Johansson et al., “Online communities as a driver for patient empowerment: systematic review,”
J. Med. Internet Res., vol. 23, p. e19910, 2021, doi: 10.2196/19910.
[10] M. Lemire, C. Sicotte, and G. Paré, “Internet use and the logics of personal empowerment in health,”
Health Policy, vol. 88, pp. 130–140, 2008, doi: 10.1016/j.healthpol.2008.03.006.
[11] R. S. Mano, “Social media and online health services: a health empowerment perspective to online
health information,” Comput. Human Behav., vol. 39, pp. 404–412, 2014, doi:
10.1016/j.chb.2014.07.032.
[12] T. Irizarry, D. A. DeVito, and C. R. Curran, “Patient portals and patient engagement: a state of the
science review,” J. Med. Internet Res., vol. 17, no. 6, p. e148, 2015.
[13] C. Jung et al., “Who are portal users vs. early e-Visit adopters? A preliminary analysis,” in Proc.
AMIA Annu. Symp., 2011, p. 1070.
[14] A. Mehrotra et al., “Characteristics of patients who seek care via eVisits instead of office visits,”
Telemed. e-Health, vol. 19, no. 7, pp. 515–519, 2013, doi: 10.1089/tmj.2012.0221.
[15] C. Cannaerts, “Models of/Models for Architecture,” in Proc. 27th eCAADe Conf., Istanbul, Turkey,
2009, pp. 781–786.
[16] W. Knoll and M. Hechinger, “Materials and Equipment,” in Architectural Models: Construction
Techniques, 2nd ed., Munich, Germany: J. Ross, 2006, pp. 24–38.
[17] D. A. Schön, “Designing: Rules, Types and Worlds,” Design Studies, vol. 9, no. 3, pp. 181–190, 1988,
doi: 10.1016/0142-694X(88)90015-4.
[18] R. T. Azuma, “Survey of augmented reality,” Presence: Teleoperators and Virtual Environments,
vol. 6, no. 4, pp. 355–385, 1997, doi: 10.1162/pres.1997.6.4.355.
[19] D. Wagner, “Handheld Augmented Reality Displays,” M.S. thesis, Graz Univ. of Technology,
Austria, 2008.
[20] S. Henderson and S. Feiner, “Evaluating the Benefits of Augmented Reality for Task Localization in
Maintenance of an Armored Personnel Carrier Turret,” in Proc. Int. Symp. Mixed and Augmented
Reality, Orlando, FL, USA, 2009, pp. 135–144, doi: 10.1109/ISMAR.2009.5336506.
[21] R. J. Hijmans and J. van Etten, “Raster: Geographic analysis and modeling with raster data,” R
Package Version 2.0-12, Jan. 12, 2012. [Online]. Available: http://CRAN.R-project.org/package=raster
[22] L. M. R. Brooks, “Musical toothbrush with mirror,” U.S. Patent D326189, May 19, 1992. [Online].
Available: NEXIS Library: LEXPAT File: DES
[23] J. O. Williams, “Narrow-band analyzer,” Ph.D. dissertation, Dept. Elect. Eng., Harvard Univ.,
Cambridge, MA, USA, 1993.
[24] F. Valdeira, S. Racković, V. Danalachi, Q. Han, and C. Soares, "Extreme multilabel classification for
specialist doctor recommendation with implicit feedback and limited patient metadata," arXiv
preprint arXiv:2308.10862, Aug. 2023.
[25] F. Jiang et al., "Artificial intelligence in healthcare: Past, present and future," Stroke and Vascular
Neurology, vol. 2, no. 4, pp. 230–243, 2017, doi: 10.1136/svn-2017-000101.
[26] Y. Wang et al., "Clinical information extraction applications: A literature review," Journal of
Biomedical Informatics, vol. 77, pp. 34–49, 2018, doi: 10.1016/j.jbi.2017.04.012.
[27] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proc. 22nd ACM SIGKDD
Int. Conf. Knowledge Discovery and Data Mining (KDD), 2016, pp. 785–794, doi:
10.1145/2939672.2939785.
[28] A. Rajkomar, J. Dean, and I. Kohane, "Machine learning in medicine," N. Engl. J. Med., vol. 380, no.
14, pp. 1347–1358, 2019, doi: 10.1056/NEJMra1814259.
[29] J. Lever, M. Krzywinski, and N. Altman, "Principal component analysis," Nature Methods, vol. 14,
no. 7, pp. 641–642, 2017, doi: 10.1038/nmeth.4346.
[30] Z. C. Lipton, D. C. Kale, C. Elkan, and R. Wetzel, "Learning to diagnose with LSTM recurrent neural
networks," arXiv preprint arXiv:1511.03677, 2015.
[31] R. Miotto, L. Li, B. A. Kidd, and J. T. Dudley, "Deep patient: An unsupervised representation to
predict the future of patients from the electronic health records," Scientific Reports, vol. 6, Art. no.
26094, 2016, doi: 10.1038/srep26094.
[32] E. Choi, M. T. Bahadori, A. Schuetz, W. F. Stewart, and J. Sun, "Doctor AI: Predicting clinical events
via recurrent neural networks," in Proc. Mach. Learn. Healthcare Conf., 2017, pp. 301–318.
[33] Z. Wang, J. Chen, and Y. Zhang, "Patient risk prediction based on complaint texts using ensemble
machine learning models," IEEE Access, vol. 9, pp. 103123–103132, 2021, doi:
10.1109/ACCESS.2021.3099563.
[34] Y. Liu, P. H. C. Chen, J. Krause, and L. Peng, "How to read articles that use machine learning: Users’
guides to the medical literature," JAMA, vol. 323, no. 11, pp. 1043–1055, 2020, doi:
10.1001/jama.2020.13444.
[35] C. Marchiori et al., "Artificial intelligence decision support for medical triage," arXiv preprint
arXiv:2011.04548, 2020.
[36] M. Mahyoub, "Integrating machine learning with discrete event simulation for improving
health referral processing in a care management setting," arXiv preprint arXiv:2206.12551, 2022.
[37] D. Gligorijević et al., "Deep attention model for triage of emergency department patients," arXiv
preprint arXiv:1804.03240, 2018.
[38] I. Girardi et al., "Patient risk assessment and warning symptom detection using deep attention-
based neural networks," arXiv preprint arXiv:1809.10804, 2018.
[39] X. Zhang, J. Zhang, J. Zhang, and Y. Zhang, "Research on the combined prediction model of
residential building energy consumption based on random forest and BP neural network,"
Geofluids, vol. 2021, Art. no. 7271383, 2021.
[40] L. Breiman, "Random forests," Machine Learning, vol. 45, pp. 5–32, 2001.
[41] M. Zekić-Sušac, A. Has, and M. Knežević, "Predicting energy cost of public buildings by artificial
neural networks, CART, and random forest," Neurocomputing, vol. 439, pp. 223–233, 2021.
[42] E. Duman, "Implementation of XGBoost method for healthcare fraud detection," Scientific Journal
of Mehmet Akif Ersoy University, vol. 5, no. 2, pp. 69–75, 2022.
[43] Y. Cao, G. Liu, J. Sun, D. P. Bavirisetti, and G. Xiao, "PSO-stacking improved ensemble model for
campus building energy consumption forecasting based on priority feature selection," Journal of
Building Engineering, vol. 72, Art. no. 106589, 2023.
[44] P. Arjunan, K. Poolla, and C. Miller, "EnergyStar++: Towards more accurate and explanatory
building energy benchmarking," Applied Energy, vol. 276, Art. no. 115413, 2020.
[45] A. Santolamazza, V. Cesarotti, and V. Introna, "Anomaly detection in energy consumption for
condition-based maintenance of compressed air generation systems: An approach based on
artificial neural networks," IFAC-PapersOnLine, vol. 51, no. 11, pp. 1131–1136, 2018.
[46] Z. J. Lee, Y. Lin, Z. Y. Chen, Z. Yang, W. G. Fang, and C. H. Lee, "Ensemble deep learning applied
to predict building energy consumption," in Proc. IEEE 6th Eurasian Conf. Educational Innovation
(ECEI), 2023, pp. 339–341.
[47] S. Mlangeni, A. E. Ezugwu, and H. Chiroma, "Deep learning model for forecasting institutional
building energy consumption," in Proc. 2020 Conf. Information Communications Technology and
Society (ICTAS), IEEE, 2020, pp. 1–8.
[48] S. Gunn, Support Vector Machines for Classification and Regression, Image Speech and Intelligent
Systems Technical Report, 1998, pp. 230–267.
[49] D. W. Hosmer, S. Lemeshow, and R. X. Sturdivant, Applied Logistic Regression, Hoboken, NJ, USA:
Wiley, 2013.
[50] H. M. Usman, R. ElShatshat, A. H. El-Hag, and R. A. Jabr, "Estimation of distribution transformer
kVA load using residential smart meter data," Electric Power Systems Research, vol. 204, Art. no.
107663, 2022.
[51] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, "LightGBM: A highly
efficient gradient boosting decision tree," Advances in Neural Information Processing Systems, vol.
30, 2017.
[52] K. Ozturk and M. Şahin, "Yapay Sinir Ağları ve Yapay Zekâ’ya Genel Bir Bakış," Tak. Vekayi, vol.
6, no. 2, pp. 25–36, 2018.
[53] J. F. Chen, Q. H. Do, and H. N. Hsieh, "Training artificial neural networks by a hybrid PSO-CS
algorithm," Algorithms, vol. 8, no. 2, pp. 292–308, 2015.
[54] A. A. Heidari, H. Faris, S. Mirjalili, I. Aljarah, and M. Mafarja, "Ant lion optimizer: Theory, literature
review, and application in multi-layer perceptron neural networks," in Nature-Inspired Optimizers:
Theories, Literature Reviews and Applications, pp. 23–46, 2020.
[55] Y. S. Park and S. Lek, "Artificial neural networks: Multilayer perceptron for ecological modeling,"
in Developments in Environmental Modelling, vol. 28, pp. 123–140, 2016.
[56] R. Qaddoura, A. Al-Zoubi, H. Faris, and I. Almomani, "A multi-layer classification approach for
intrusion detection in IoT networks based on deep learning," Sensors, vol. 21, no. 9, Art. no. 2987,
2021.
[57] E. Gavcar and H. M. Metin, "Hisse Senedi Değerlerinin Makine Öğrenimi (Derin Öğrenme) İle
Tahmini," Ekonomi ve Yönetim Araştırmaları Dergisi, vol. 10, no. 2, pp. 1–11, 2021.
[58] M. E. Balaban and E. Kartal, "Veri Madenciliği Süreci ve Futbol Maç Sonuçlarının Öngürülmesine
İlişkin Bir Uygulama," in R ile Veri Madenciliği Uygulamaları, İstanbul, Türkiye: Çağlayan
Kitabevi, 2016.
[59] N. S. Altman, "An introduction to kernel and nearest-neighbor nonparametric regression," The
American Statistician, vol. 46, no. 3, pp. 175–185, 1992.

