Background

Asthma is one of the most common chronic respiratory diseases and has a significant impact on patients and their families. The symptoms of asthma include coughing, wheezing, chest tightness and shortness of breath. These symptoms vary from person to person, ranging from mild to severe, and may occur frequently or infrequently. An asthma exacerbation occurs when these symptoms worsen [1, 2]. The 2024 Global Strategy for Asthma Management and Prevention guidelines define asthma exacerbations in adolescents and children aged 6–11 years as episodes characterized by a progressive increase in symptoms of shortness of breath, cough, wheezing or chest tightness and a progressive decrease in lung function. Exacerbations may occur in patients with a preexisting diagnosis of asthma or, occasionally, as the first presentation of asthma. For children under 5 years of age, the guidelines define an asthma exacerbation as an acute or sub-acute deterioration in symptom control that is sufficient to cause distress or risk to health and necessitates a visit to a healthcare provider or requires treatment with systemic corticosteroids [3]. These exacerbations often result in school absences, parental absenteeism, unplanned emergency department (ED) visits, and hospitalizations, severely affecting the health-related quality of life (HRQL) of children and their parents [4]. Despite interventions, nearly half of pediatric asthma patients experience exacerbations annually, with one in six requiring an ED visit and one in twenty requiring hospitalization. These exacerbations account for over 1.8 million ED visits and more than 60% of the total cost of asthma care [5,6,7,8].

In recent years, machine learning (ML) algorithms have shown significant potential in the management of pediatric asthma exacerbations. ML algorithms can help physicians diagnose asthma more accurately by analyzing clinical data, such as electronic health records (EHRs) and pulmonary function test results, and can predict the risk of asthma exacerbations by integrating multiple sources of data, including clinical indicators, environmental factors, and socioeconomic factors [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]. They can also provide patients with more personalized and precise medical services through their ability to process and analyze large and complex datasets and to capture highly nonlinear relationships and complex interactions in the data [32,33,34,35]. However, systematic reviews of the applications of ML in pediatric asthma exacerbation management are still lacking.

This study aims to comprehensively analyze the current research progress of ML techniques in pediatric asthma exacerbation management, identify the critical risk factors, evaluate the effectiveness of these techniques, and explore the potential applications of ML in the diagnosis and prediction of pediatric asthma exacerbations, personalized treatment, and long-term health management. The knowledge synthesis from this study may provide a scientific basis for clinical decision-making, policy formulation, and health education, potentially improving the quality and efficiency of care in the future.

Methods

Search strategy

This systematic review complies with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards. Institutional review board approval was not required, as publicly available data were used and no human subjects were involved. A comprehensive search was performed in PubMed, EBSCO, Elsevier, and Web of Science, covering the period from January 2000 to January 2025. Studies published before 2000 were ineligible since they were considered less relevant to modern asthma care [36]. The search strategy was centered around the terms “Asthma,” “Asthma exacerbation/attack/deterioration,” and “Machine learning/Deep learning” and included their appropriate derivatives and synonyms such as “Asthmas” OR “Bronchial Asthma” OR “Asthma, Bronchial”, “Learning, Machine” OR “Transfer Learning” OR “Learning, Transfer”, “Deep Learning” OR “Learning, Deep” OR “Hierarchical Learning” OR “Learning, Hierarchical”. Additionally, we examined the reference lists of included articles to identify any relevant studies not retrieved by the electronic searches. The full search strategy is provided in Additional file 1.

Inclusion and exclusion criteria

Eligibility criteria for inclusion were: (1) The study subjects included children and/or adolescents under 18 years; (2) The study subjects had asthma exacerbation/attack/deterioration and/or asthma exacerbation/attack/deterioration was included as a primary or secondary outcome; (3) The article was written in English; (4) ML was researched or applied for asthma exacerbation/attack/deterioration; (5) The study was an observational study (including retrospective, prospective, cohort, and case-control studies) or a randomized controlled trial (RCT).

We excluded the following: (1) books, dissertations, theses, conference abstracts, comments, patents, awarded grants, editorials, and case reports; (2) systematic reviews and meta-analyses; (3) articles without full text.

Screening and data extraction

Two authors (Chunni Zhou and Liu Shuai) independently scanned abstracts, titles, and citations retrieved by electronic and hand searches against the inclusion criteria to assess eligibility. Two review authors independently reviewed the full-text studies retrieved to determine final eligibility. Disagreements were discussed and resolved by consensus, and if necessary, a third author (Meng Li) was involved.

Data were extracted by one reviewer (Chunni Zhou) and checked for consistency by the other two reviewers (Liu Shuai and Meng Li). Data extracted included first author, country, year, study design, data collection period, study population, sample size, type of ML algorithm, definition of asthma exacerbation/attack/deterioration, outcome events, and study results. For the ML algorithms, we also extracted validation methods and performance metrics. In addition, we read through each study to generalize and categorize the research objectives. Due to significant methodological heterogeneity among the studies, a meta-analysis was not conducted. Instead, a narrative synthesis of the results was performed, and complete details of the included studies are reported in Table 1.

Table 1 Main characteristics of the included studies

Quality assessment

The Effective Public Health Practice Project (EPHPP) quality assessment tool for quantitative studies [37] was used to assess each study for potential biases and global study quality. Studies were given a global rating of strong, moderate, or weak based on the score. The components covering confounders, blinding, intervention integrity, and analysis were removed from the tool, as these were irrelevant to the study designs assessed in this review. This left the following areas: selection bias, study design, data collection methods, and withdrawals and dropouts. One author (Chunni Zhou) conducted this assessment, and the results were discussed with the second author.

Results

Study selection

Figure 1 shows the study selection process. A total of 675 papers were identified from four databases (PubMed (136), Elsevier (47), Web of Science (333), and EBSCO (159)). After 335 duplicates were excluded, 340 papers were screened by title and abstract, leaving 31 potentially eligible papers. Full-text screening of these articles confirmed the eligibility of 16. Examining the references of these articles yielded 7 more, for a total of 23 articles included in the review (Fig. 1).

Fig. 1

Preferred reporting items for systematic reviews and meta-analyses (PRISMA) flow diagram

Study characteristics

The included papers were published between 2006 and 2024. Eight studies were published between 2006 and 2015 [9,10,11,12,13,14,15, 28] and 15 between 2016 and 2024 [16,17,18,19,20,21,22,23,24,25,26,27, 29,30,31]. Of these, 14 studies were from America [9,10,11, 14, 16, 17, 19, 23,24,25,26,27, 29, 30], three were from the Netherlands [13, 15, 18], and one each was from Canada [12], Greece [20], Korea [21], Japan [22], China [28] and Poland [31] (Fig. 2).

Fig. 2

Study characteristics of the included studies. Note: (a) study setting of the included studies; (b) study design of the included studies; (c) study population of the included studies; (d) publication years of the included studies; (e) sample size of the included studies

Nine were prospective studies [9, 12,13,14,15, 17, 18, 21, 31], including two prospective cohort studies [17, 18] and two prospective longitudinal studies [13, 15], with follow-up times of one year [13, 15, 18] and three years [17]. Thirteen were retrospective studies [10, 11, 16, 19, 20, 22, 24,25,26,27,28,29,30], including one retrospective cohort study [27] and one case-crossover study [16], with follow-up periods of two years [27] and eleven years [16]. One study was a randomized controlled trial with a 1-year follow-up, in which the intervention positively affected exacerbation outcomes [23] (Fig. 2).

The age range of the study populations varied: five studies included children aged 2–18 years [9, 10, 14, 19, 30], three included children aged 0–17 years [23, 25, 31], two included children aged 5–18 years [26, 27], two included children aged 6–18 years [15, 18], and one each included children aged six months to 15 years [22], 1–14.5 years [20], 1–17 years [12], 2–21 years [24], 5–12 years [11], 6–14 years [21], 6–16 years [13] and 6–17 years [29]. Three studies did not specify the age of the pediatric participants [16, 17, 28] (Fig. 2).

The sample size ranged from 14 to 54,981 (eight studies ≤ 100 [13, 15, 17, 18, 20, 21, 28, 31], six studies ≤ 1,000 [11, 12, 14, 23, 25, 30], four studies ≤ 10,000 [9, 10, 24, 26], four studies > 10,000 [16, 19, 22, 27], and one study did not report the sample size [29]). Full details of the included studies are reported in Table 1 (Fig. 2).

Definition of exacerbation and outcome

Ten studies did not specify a definition of asthma exacerbation [10, 14, 16, 17, 20, 21, 24, 28,29,30,31], three studies defined asthma exacerbation according to the most recent ATS/ERS criteria [13, 15, 18], three studies defined asthma exacerbation using emergency room visits and hospitalizations [11, 12, 27], two studies defined asthma exacerbation using emergency room visits/hospitalizations or oral corticosteroid use [23, 25], two studies defined asthma exacerbation using International Classification of Diseases codes (ICD-9 or ICD-10) [22, 26], one study defined asthma exacerbation using concomitant receipt of albuterol and systemic corticosteroids [19], one study defined asthma exacerbation using only emergency room visits [9] and one study defined asthma exacerbation using emergency room visits, hospitalizations, or outpatient visits with use of oral corticosteroids [29]. Seventeen of the studies had pediatric asthma exacerbation as a primary or secondary outcome [9,10,11,12,13, 15,16,17,18, 20, 23, 25, 26, 28,29,30,31], while the remaining six studies included populations with pediatric asthma exacerbations [14, 19, 21, 22, 24, 27]. The outcomes of these six studies were mainly related to emergency room visits or hospitalizations [19, 24, 27], peak expiratory flow rate (PEFR) values [21], asthma control exacerbation [14], and variation in antibiotic use and adjunctive therapy [22].

ML model related characteristics

Regarding learning algorithms, the 23 studies used a total of 59 ML models, which were grouped into 12 categories; the most common was Bayesian Networks (BN), followed by Random Forests (RF), Decision Trees (DT), Neural Networks (NN) and Support Vector Machines (SVM) (Fig. 3).

Fig. 3

Histogram of different ML algorithms for asthma exacerbation management. Note: There were a total of 59 models across the 23 studies, which were categorized to give a total of 12 categories of models

Because of data limitations, the performance of the ML models could be examined in only five studies. The performance metrics were accuracy, sensitivity, specificity, area under the curve (AUC), positive predictive value (PPV) and negative predictive value (NPV). Overall, most values of the six metrics were high across the models in the five studies; specificity, AUC and NPV were close to 100% for some models. Among all the metrics, accuracy had the most concentrated distribution, with most models at around 70%, whereas the values of the remaining metrics were more dispersed. The five models of Dexheimer JW et al [10] had high sensitivity, specificity, AUC and NPV, with the NPV of BN, Max-Min Hill-Climbing (MMHC) and Gaussian process (GP) reaching 98.9%, but their PPVs were low, especially that of the Artificial Neural Network (ANN), at only 38%. The values of the five models of Gardeux V et al [17] were relatively close to one another across the six metrics, lying between 60% and 70%. The accuracy, specificity, AUC and NPV of the six models of Luo G et al [14] were between about 70% and 80%, but sensitivity and PPV were lower, with RF having the lowest sensitivity at 38.3%; as a result, the radar chart shows a rectangular pattern (Fig. 4).

Fig. 4

Radar charts of ML model performance metrics. Note: Because of data limitations, all five models studied by Dexheimer JW were missing accuracy, with the SVM also missing AUC; the NB model studied by Farion KJ was missing AUC; and the BN model studied by Sanders DL was missing accuracy. BN, Bayesian Network; MMHC, Max-Min Hill-Climbing; ANN, Artificial Neural Network; GP, Gaussian process; SVM, Support Vector Machine; NB, Naive Bayes model; RF, Random Forest; DT, Decision Tree; KNN, K-Nearest Neighbor; DS, Decision Stump; DNN, Deep Neural Network

Farion KJ et al [12] and Sanders DL et al [9] each included only one model. In the NB model of Farion KJ et al [12], the highest of the five reported metrics was 85% (PPV) and the lowest was 53% (NPV). In contrast, for the Bayesian network model of Sanders DL et al [9], the highest reported value was 98.8% (NPV) and the lowest was 44.7% (PPV).

Regarding algorithm validation methods, nine studies used cross-validation [9, 12,13,14, 19,20,21, 29, 31], six studies used split-sample validation [10, 15, 16, 24,25,26], two studies used bagging validation [18, 27], two studies used holdout validation [11, 17], and four studies did not mention validation methods [22, 23, 28, 30].
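To illustrate how two of the validation schemes named above differ, the following minimal Python sketch generates index partitions for a single holdout split and for k-fold cross-validation. The function names, the 30% test fraction, k = 5, and the sample size of 20 are assumptions chosen for the example; they are not taken from any included study.

```python
import random


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) for one random holdout split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test_fold)


# Hypothetical cohort of 20 patients.
train_idx, test_idx = holdout_split(20, test_fraction=0.3)
folds = list(k_fold_splits(20, k=5))
print(len(train_idx), len(test_idx))
print([len(test) for _, test in folds])
```

In the k-fold scheme every record is used for testing exactly once, whereas a single holdout split evaluates the model on only one subset; with the small samples (≤ 100) seen in several included studies, that difference may matter for the stability of the reported metrics.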

Of the 23 studies included, 21 studies dealt with classification tasks using ML models [9,10,11,12,13,14,15, 17,18,19,20,21, 23,24,25,26,27,28,29,30,31], only one study dealt with clustering tasks [22], and one study dealt with association analysis tasks using association rule mining [16].

The number of variables entered into the models varied across studies: 5 studies used ≤ 10 variables [9, 18, 21, 22, 27], 13 studies used ≤ 50 variables [10, 12, 14,15,16,17, 19, 20, 23,24,25, 28, 29], 1 study used > 100 variables [13], and 4 studies did not report the number of variables [11, 26, 30, 31]. In all 23 studies, the variables input into the model were a mixture of numerical and categorical variables [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]. The numerical variables mainly included physiological indicators of patients (such as BMI and pulmonary function indicators), environmental data, and genetic data. The categorical variables mainly included basic patient characteristics (such as gender, age, and ethnicity), clinical characteristics (such as allergic constitution, comorbid conditions, and asthma severity), treatment-related factors (such as medication use and treatment group), and other clinical diagnostic information (such as symptom severity).

In addition, regarding Explainable Artificial Intelligence (XAI), five studies used feature importance maps to improve the interpretability of model results [11, 18, 19, 24, 27], five studies used interpretable models and visualized model structures and processes [9, 10, 16, 20, 28], and one study used the SHapley Additive exPlanations (SHAP) method to calculate Shapley values [30].

Applications of ML algorithms in pediatric asthma exacerbations (categorized by disease management)

Assessment of risk factors

Eight studies have used ML to assess risk factors for pediatric asthma exacerbations, exploring genomic data, environmental elements, and socioeconomic status [11, 16, 17, 21, 25,26,27,28]. Two studies focused on genomics [11, 17], two on social factors [25, 27] and four on environmental factors such as indoor and outdoor air pollutants [16, 21, 26, 28]. Some studies analyzed the effects of genetic polymorphisms and air pollutants on pediatric asthma exacerbations using Random Forest classifiers and association rule mining techniques [11, 16, 28]; one of them proposed two novel data mining methods, the pattern-based decision tree (PBDT) and the pattern-based class association rule (PBCAR), to combine patient biosignals and environmental data for asthma exacerbation. Another study used deep learning algorithms, such as Long Short-Term Memory (LSTM) modeling, to predict the risk of pediatric asthma exacerbations and assessed the effects of indoor particulate matter concentrations on PEFRs in pediatric asthma [21]. In addition, studies evaluated the performance of ML models across different socioeconomic groups, aiming to minimize biases [25, 27]. Together, these studies emphasize the importance of individual differences and environmental factors in the management of pediatric asthma exacerbations.

Diagnosis and prediction of pediatric asthma exacerbations

Nine studies have applied ML to diagnose pediatric asthma exacerbations with high accuracy, leveraging clinical data and patient characteristics [9, 10, 12, 13, 15, 18, 20, 30, 31]. Three studies developed and evaluated Bayesian networks for diagnosing patients in line with treatment guidelines in pediatric emergency departments and for predicting exacerbations after medication withdrawal [9, 10, 20]. Three other studies used Volatile Organic Compounds (VOCs) and inflammatory markers to diagnose and predict exacerbations, with one noting high accuracy for certain VOC combinations [13, 15, 18]. One study used ML to accurately identify pediatric asthma exacerbations from prehospital records by modifying an existing rule-based computable phenotype (CP) and creating a new ML-based CP [30]. One study used an AI-assisted home stethoscope and found that the parameters provided by the device were very effective in detecting pediatric asthma exacerbations [31]. Still another study compared the efficacy of various algorithms, including Bayesian networks, ANNs, SVMs, and Gaussian processes, in predicting asthma exacerbations in the ED, concluding that all achieved high accuracy [12].

Optimization and allocation of medical resources

Three studies explored the use of ML for optimizing medical resource allocation in pediatric asthma exacerbations [19, 22, 24]. One study compared four ML models that combined clinical, environmental, and social data to predict demand for hospitalization, with gradient boosting performing best [19]. Another applied automated machine learning (autoML) algorithms, which outperformed traditional ML [24]. The third analyzed hospitalization patterns in Japan, finding that antibiotic use and the use of other adjunctive treatments differed significantly between hospitals [22].

Comprehensive asthma management

Three studies have applied ML to the comprehensive management of pediatric asthma exacerbations [14, 23, 29]. One study combined ML algorithms to predict asthma control exacerbations one week in advance by analyzing clinical and environmental data [14]. Two studies evaluated an artificial intelligence-assisted clinical decision support system (AI-CDS) [23, 29], specifically the Asthma Guidance and Prediction System (A-GPS), which uses EHRs to provide clinical information and predict the risk of asthma exacerbation [23]. While all three studies used ML to enhance asthma management, the first emphasized model development and improvement of predictive ability [14], whereas the other two assessed AI-CDS in clinical practice [23, 29].

Quality assessment

The EPHPP quality assessment in Fig. 5 rated 5 studies as strong [13, 15, 18, 23, 27], 15 as moderate [10,11,12, 14, 16, 17, 19,20,21,22, 24,25,26, 30, 31] and 3 as weak [9, 28, 29].

Fig. 5

Quality assessment. Note: The confounders, blinding, intervention integrity, and analyses components did not apply to any of the studies and were therefore removed

Discussion

This systematic review covers a broader and more recent time frame, spanning from January 2000 to January 2025, providing insights into the evolving use of ML in pediatric asthma exacerbations. It covers more than prediction alone [4, 36, 38, 39], encompassing risk factor assessment, diagnosis and prediction of pediatric asthma exacerbations, optimization and allocation of medical resources, and comprehensive asthma management, and thus offers a more holistic understanding of ML’s role in this domain.

Findings

The review also highlights that the majority of ML applications in the included studies were predictive models for pediatric asthma exacerbations. This trend likely reflects the rapid onset and dynamic nature of asthma exacerbation, which can be life-threatening. Prediction of pediatric asthma exacerbations plays a crucial role in enabling preventive interventions and targeted treatment [40]. Furthermore, disease diagnosis can provide physicians with actionable predictive data to support decision-making, enhance healthcare process efficiency, and reduce costs [40,41,42]. ML algorithms have been applied not only in predicting acute exacerbations of chronic obstructive pulmonary disease (COPD) [43,44,45] and acute kidney injury [46, 47], but also in detecting and diagnosing conditions such as asthma, heart disease, and diabetes [48,49,50]. Additionally, ML aids in creating personalized treatment plans for diseases like cancer and rheumatoid arthritis [51, 52], while optimizing healthcare resource management [53, 54].

Because pediatric asthma exacerbation is broadly defined, this review included studies that explicitly used the keyword “asthma exacerbation”. The definitions of pediatric asthma exacerbation varied across studies; most defined it in terms of hospitalization, emergency visits, and specific medical interventions, and these differing definitions make the findings non-comparable. Different definitions may also affect the diagnosis of pediatric asthma exacerbations. A looser definition may capture more potential exacerbations and thus increase sensitivity, but it may also diagnose an exacerbation in patients who do not have one, which reduces specificity and leads to more false-positive diagnoses. Standardizing the definition of pediatric asthma exacerbation could therefore help improve both study quality and diagnostic accuracy [55, 56].

ML research in pediatric asthma also differs from that in adult asthma. The pathogenesis of pediatric asthma is relatively complex, and many risk factors are still unknown; studies of pediatric asthma therefore usually focus on factors associated with child growth and development, such as family history and genetic predisposition [11, 17, 32]. These factors are often difficult to control, and they may be less significant in studies of adult asthma exacerbations, which have focused more on lifestyle and environmental factors, such as air pollution and occupational exposures [57, 58]. In pediatric asthma, controlling certain environmental factors, such as exposure to tobacco smoke, pet hair and dust mites, may reduce the risk of asthma exacerbation [26, 32, 59].

In addition, ML models for pediatric asthma place more emphasis on interpretability so that they can be accepted and used by physicians and parents. The interpretability of black-box models (such as the more complex ML and deep learning (DL) models) can be improved with post-hoc interpretation methods such as feature importance analysis and Local Interpretable Model-agnostic Explanations (LIME) [60, 61]. Alternatively, one can directly use interpretable models, such as linear models and decision trees, which have simple structures and are easy to understand and interpret. Textual interpretation and visualization of models or results can also improve interpretability to some extent. There is no unified and objective standard for assessing interpretability, and different methods and application scenarios may require different assessment indicators [62].

Approaches integrating multiple ML algorithms have also shown promising results in pediatric asthma exacerbation studies, especially when considering multiple meteorological, environmental, and pollen factors [26]. This suggests that combining different ML algorithms may provide more accurate models for pediatric asthma exacerbation studies [12, 14, 17, 21]. Meanwhile, emerging deep learning models have also shown advantages, as they can efficiently process the complex nonlinear relationships between risk factors, thus improving model accuracy [14, 21, 63].

This systematic review found that the number of variables entered into the models varied across studies. On comparison, studies with fewer model inputs included correspondingly fewer subjects, used fewer types of ML algorithms, and relied on simple, easy-to-use models with high interpretability; the ML in these studies was more focused on clinical application for pediatric asthma exacerbation. In contrast, studies with more model inputs included very large numbers of subjects and compared multiple ML algorithms; the resulting models were also more accurate, and these studies focused more on ML methods and prediction performance.

When assessing model performance, using a combination of metrics is essential, as relying on a single metric can be misleading. Different metrics capture different aspects of model performance, and together they provide a more comprehensive evaluation. For instance, accuracy alone can be very deceptive, particularly with unbalanced datasets. In such cases, accuracy may overestimate model effectiveness by favoring the majority class while masking poor performance on minority classes. While AUC (area under the ROC curve) is robust to class imbalance and offers a holistic view of model performance, it also has limitations. AUC does not convey how well the model performs at particular decision thresholds, which is critical for practical applications. For example, in pediatric asthma exacerbation studies, the performance of the model at specific decision thresholds can directly affect clinical outcomes. Therefore, it is necessary to refer to metrics such as sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). For example, if the model is used for screening and diagnosis of pediatric asthma exacerbations, then high sensitivity and NPV are essential to reduce underdiagnosis and underreporting, while high specificity and PPV help to minimize misdiagnosis and misreporting [40, 64,65,66].
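The point about accuracy on unbalanced data can be made concrete with a small Python sketch. The cohort below (100 children, 10 of whom have an exacerbation) and the two sets of predictions are hypothetical numbers invented for illustration; the function names are likewise assumptions and do not come from any included study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = exacerbation, 0 = none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return NaN instead of failing when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def screening_metrics(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
    }


# Hypothetical unbalanced cohort: 10 exacerbations among 100 children.
y_true = [1] * 10 + [0] * 90
always_negative = [0] * 100                               # never flags anyone
screening_model = [1] * 8 + [0] * 2 + [1] * 18 + [0] * 72  # 8 TP, 2 FN, 18 FP, 72 TN

metrics_a = screening_metrics(y_true, always_negative)
metrics_b = screening_metrics(y_true, screening_model)
for name in metrics_a:
    print(f"{name:12s} A={metrics_a[name]:.3f}  B={metrics_b[name]:.3f}")
```

In this toy example the classifier that never predicts an exacerbation reaches 90% accuracy with 0% sensitivity, whereas the second classifier has lower accuracy (80%) but detects 8 of the 10 exacerbations, with an NPV of about 97% and a PPV of about 31%.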

To effectively evaluate model performance, the choice of metrics should align with the characteristics of the dataset and the objectives of the modeling task. AUC is widely used to comprehensively assess a model’s ability to distinguish between categories. It summarizes the model’s performance under different thresholds and is particularly useful for unbalanced datasets. Sensitivity and specificity are critical when evaluating the performance of a model on positive and negative classes. For situations where false positives and false negatives need to be balanced, metrics such as precision and recall (sensitivity) can be combined into an F1 score to provide a balanced assessment, especially in unbalanced datasets. In addition, accuracy is a straightforward metric for measuring the proportion of correct predictions. However, in unbalanced datasets, accuracy can be misleading, and metrics such as Precision-Recall Curve and Area Under Precision-Recall Curve (AUPRC) provide a more accurate picture of model performance.

Ultimately, the selection of evaluation metrics should be tailored to the specific goals of the modeling task. For instance, AUC measures overall discriminatory power, sensitivity and specificity assess category-specific performance, and the F1 score balances precision and recall; taken together, they provide a comprehensive understanding of model effectiveness [67,68,69].
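The contrast between a threshold-free metric (AUC) and a threshold-dependent one (F1) can be sketched in a few lines of Python. Here AUC is computed from its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels, scores, and thresholds are hypothetical values chosen for illustration.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true, scores, threshold):
    """F1 score after converting scores to labels at the given threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical risk scores for 3 exacerbations and 7 non-exacerbations.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.5, 0.3, 0.3, 0.2, 0.1, 0.1]

auc = roc_auc(y_true, scores)
f1_low = f1_at_threshold(y_true, scores, 0.5)
f1_high = f1_at_threshold(y_true, scores, 0.75)
print(f"AUC={auc:.3f}  F1@0.50={f1_low:.3f}  F1@0.75={f1_high:.3f}")
```

The AUC is a single number for the whole score distribution (18 of 21 pairs ranked correctly, about 0.86), while the F1 score changes from about 0.57 to 0.40 when the threshold moves from 0.50 to 0.75, which is why a threshold-specific metric is still needed once a model is deployed at a fixed cut-off.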

Implications and recommendations

Regarding the diagnosis of pediatric asthma exacerbation, the Global Strategy for Asthma Management and Prevention (Updated 2024) can be referred to for a comprehensive diagnosis that combines information from clinical symptoms, medical history, physical examination, and pulmonary function tests. Accurate diagnosis helps to detect signs of asthma exacerbation and to take targeted treatment measures. However, diagnostic practice varies from region to region, and diagnosis should be standardized according to local diagnostic criteria [3, 70].

In the clinical use of ML models to assist decision-making, explanatory tools such as LIME and SHAP can be used to interpret model results and help doctors and parents understand the model’s decisions. For example, the SHAP method can clarify which risk factors have the greatest impact on the current prediction, so that more targeted preventive measures can be taken. Alternatively, ML algorithms that are inherently more interpretable, such as decision trees and logistic regression, can be chosen [60,61,62].
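The idea behind Shapley-value attribution can be sketched without any external library. The Python example below computes exact Shapley values for a single prediction by averaging each feature's marginal contribution over all feature orderings, holding features that have not yet been "revealed" at a baseline value. This is a simplified illustration of the concept, not the SHAP library's implementation; the toy risk function, its coefficients, and the feature names are hypothetical and are not drawn from any included study.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction (feasible only for few features)."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)          # start from the reference patient
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature at a time
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(x):
    """Hypothetical exacerbation risk score with one interaction term."""
    score = 0.05 + 0.20 * x["prior_ed_visit"] + 0.10 * x["smoke_exposure"]
    score += 0.15 * x["prior_ed_visit"] * x["smoke_exposure"]
    score += 0.02 * x["pm25_high"]
    return score


baseline = {"prior_ed_visit": 0, "smoke_exposure": 0, "pm25_high": 0}
patient = {"prior_ed_visit": 1, "smoke_exposure": 1, "pm25_high": 1}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name:15s} {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is shared equally between the two features involved. Exact enumeration grows factorially with the number of features, so practical tools rely on approximations.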

When evaluating the performance of the model, in addition to focusing on metrics such as accuracy and AUC, attention should also be paid to the sensitivity, specificity, PPV and NPV of the model. These metrics reflect the performance of the model in practical applications more comprehensively and help doctors assess the reliability and usefulness of the model [64,65,66, 69].
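One practical reason to report PPV and NPV alongside sensitivity and specificity is that the predictive values depend on how common exacerbations are in the population being tested. The Python sketch below applies Bayes' theorem to a model with fixed sensitivity and specificity at several prevalences. The 85% sensitivity and specificity and the three prevalence values are assumptions chosen for illustration, not results from the included studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical model: 85% sensitivity and 85% specificity in every setting.
results = {}
for prevalence in (0.05, 0.20, 0.50):
    results[prevalence] = predictive_values(0.85, 0.85, prevalence)
    ppv, npv = results[prevalence]
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed values, the PPV is about 0.23 at 5% prevalence and 0.85 at 50% prevalence, while the NPV moves in the opposite direction. A model with a high NPV and a low PPV may therefore reflect a low event rate in the evaluation data rather than poor discrimination.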

Strengths and limitations of the study

The strengths of this systematic review lie in its comprehensive search strategy, adherence to a rigorous systematic review methodology and reporting guidelines, and the independent assessment by researchers during title, abstract, and full-text screening, with data extraction verified by multiple reviewers. However, this review has some limitations. First, non-English studies were excluded, which may limit the generalizability of the findings. Second, the definition of asthma exacerbation was not standardized across studies, making comparisons difficult. Third, a meta-analysis could not be conducted due to significant heterogeneity in study samples, participants, and outcomes.

Future research

Future research on ML in pediatric asthma exacerbations holds considerable promise. Enhancing data quality and diversity is crucial, with the inclusion of broader datasets encompassing pediatric asthma genetic information, environmental factors, and lifestyle habits. Additionally, algorithmic advancements, especially in deep learning, will drive further personalization of treatment, leading to improved efficacy. The integration of real-time monitoring systems using wearables and smart devices will support detection and prevention. Finally, interdisciplinary collaboration among experts in medicine, data science, and engineering will be essential in addressing complex problems and developing more effective asthma management tools.

Conclusions

The systematic review indicates great potential for ML in pediatric asthma exacerbation management, including risk identification, diagnosis, and personalized care. However, challenges such as data quality, algorithm optimization, and interdisciplinary collaboration need to be addressed in clinical practice. Future work should prioritize model robustness, data security, and clinical testing to advance the field.