In a recent study published in The Lancet Digital Health, researchers performed a meta-analysis to evaluate the quality and performance of deep learning and machine learning models for long-term chronic obstructive pulmonary disease (COPD) prognosis. They also compared the models with previous predictive regression models.
Study: Machine learning and deep learning predictive models for long-term prognosis in patients with chronic obstructive pulmonary disease: a systematic review and meta-analysis.
COPD is a significant cause of mortality worldwide, and the expenses of treating it are likely to rise. Deep learning and machine learning models are increasingly used for predicting long-term progression in COPD patients to save costs and enhance healthcare delivery efficiency.
Deep learning uses layered (hierarchical) neural networks to learn complex patterns and relationships, whereas machine learning more broadly uses algorithms to learn rules and correlations from data. Deep learning is gaining popularity as a means of identifying COPD patients at risk of poor outcomes.
About the study
In the present meta-analysis, researchers comparatively assessed the prognostic performance of deep learning and machine learning models for COPD.
The team searched the Cochrane Library, PubMed, ProQuest, Embase, Web of Science, and Scopus databases from database inception until 6 April 2023 for relevant studies published in English that used deep learning or machine learning to estimate patient outcomes ≥6 months after the initial COPD diagnosis. Hand searches of included study references and systematic reviews were completed to identify missed studies.
The team included studies with human participants aged 18 to 90 years with a history of COPD for ≥6 months, with no restriction on input modality.
The studies documented the area under the receiver operating characteristic (ROC) curve (AUC) values for estimating deaths, exacerbations, and reduction in forced expiratory volume in one second (FEV1). Retrospective and prospective cohort studies, cross-sectional studies, case-control studies, and randomized clinical trials were included.
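For readers unfamiliar with the metric, AUC can be read as the probability that a model assigns a higher risk score to a randomly chosen patient who had the event than to a randomly chosen patient who did not. The following is a minimal sketch of that pairwise definition; the function name and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auc(labels, scores):
    """AUC via its pairwise definition: the share of (positive, negative)
    pairs in which the positive case has the higher score. Ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = event occurred (e.g., exacerbation), 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.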
Prognostic studies for COPD development among individuals who did not initially have COPD were excluded, as were prognostic studies that did not include at least six months of follow-up. Two researchers independently screened the data, and disagreements were resolved by discussion or consulting another researcher.
The researchers assessed heterogeneity using Cochran’s Q statistic and the I² value derived from it (an I² above 50% indicated significant heterogeneity). They evaluated bias risks and reporting quality using the PROBAST and TRIPOD checklists, respectively.
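The I² value is conventionally computed from Cochran’s Q and the number of pooled studies, as I² = max(0, (Q − df) / Q) × 100, where df is the number of studies minus one. The sketch below shows that standard calculation; the function names are assumptions made for illustration, and this is not the authors' analysis code.

```python
def cochran_q(effects, variances):
    """Cochran's Q: inverse-variance weighted sum of squared deviations
    of each study's effect from the fixed-effect pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, n_studies):
    """I² as a percentage: the share of total variation attributed to
    between-study heterogeneity rather than chance."""
    df = n_studies - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical value: Q = 10 across 6 studies gives I² = 50%.
example_i2 = i_squared(10.0, 6)
```

When Q is no larger than its degrees of freedom, I² is truncated to 0%, meaning the observed spread is no greater than chance alone would produce.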
Machine learning was defined as algorithms more complex than regression models, such as support vector machines and random forests, that learn to make judgments based on patterns in data.
Deep learning was defined as the use of neural networks with ≥2 hidden layers. Hospitalization was used as a proxy for exacerbation. Changes in six-minute walk test performance, dyspnea, or COPD load, as determined by questionnaires or psychophysical measures, were used as surrogate indicators for FEV1 reduction.
The team initially identified 3,620 studies, of which 18 met the eligibility criteria: six used deep learning models and 12 used machine learning models.
Seven models assessed exacerbation risks; however, only six reported AUC values (combined AUC, 0.8). The I² value of 97% indicated significant heterogeneity in the included studies.
Eleven models estimated death risks; however, only six reported AUC values (combined AUC, 0.8) with significant heterogeneity (I² of 60%). Two studies evaluated reductions in pulmonary function but could not be pooled.
The reduction in pulmonary function of >30 mL over five years was predicted with AUC 0.8 on a cohort of 42 individuals using five-fold cross-validation, and the overall decline in lung volume was predicted with AUC 0.7 among 4,496 patients using 10-fold nested cross-validation.
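To show what five-fold cross-validation means for a cohort of 42, the sketch below splits 42 sample indices into five held-out folds. It is a simplified illustration only: the function name is hypothetical, no shuffling or stratification is applied, and it does not reproduce the nested procedure used in the larger study.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds.
    Returns a list of (train_indices, test_indices) pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(42, 5)
test_sizes = [len(test) for _, test in folds]
```

With 42 participants, each held-out fold contains only eight or nine individuals, so a per-fold AUC rests on very few cases.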
Deep learning and machine learning-based models showed no significant improvement in estimating exacerbations compared to pre-existing scores for disease severity. Three studies comparatively assessed machine learning-based models and disease severity ratings for mortality estimation, and five external validation studies showed performance worse than regression modeling.
Bias risk was most often attributed to mishandling missing data, using datasets that were too small, and not reporting model uncertainty.
Meta-analysis indicated no significant difference in pooled AUC between machine learning and conventional regression models (pooled AUC values, 0.77 vs. 0.75).
The study found that six of the 12 machine learning studies (50%) had an events-per-variable (EPV) ratio below 10, with a median of 82 positive events and 154 predictor variables. The remaining six studies had an EPV above 20, with a median of 1,645 positive events and 11.5 predictor variables.
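The EPV ratio is the number of outcome events divided by the number of candidate predictors. As a rough illustration, the sketch below applies that division to the reported group medians. This is an assumption-laden approximation: a ratio of medians is not the same as the median of per-study ratios, and the function name is hypothetical.

```python
def events_per_variable(n_events, n_predictors):
    """Events-per-variable ratio: outcome events per candidate predictor."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


# Reported medians for the low-EPV group: 82 events, 154 predictors.
low_epv = events_per_variable(82, 154)

# Reported medians for the high-EPV group: 1,645 events, 11.5 predictors.
high_epv = events_per_variable(1645, 11.5)
```

On these medians the low-EPV group has fewer events than predictors (ratio below 1), which is consistent with the small-dataset concern raised in the bias assessment.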
Based on the study findings, there is scant evidence that deep learning and machine learning models outperform pre-existing COPD severity scores for long-term COPD prognosis. Researchers should follow the PROBAST and TRIPOD criteria to increase the study findings' reproducibility.
Due to the number of variables required, conventional models have limited clinical applicability. However, there may be more opportunities in clinical practice for deep learning assessment of computed tomography (CT) data.