In a recent study published in the American Journal of Roentgenology, researchers developed a deep learning (DL) model to estimate 30-day mortality risk among community-acquired pneumonia (CAP) patients, using the chest X-rays obtained for diagnosis as inputs. They also validated the model's performance among patients from different institutions and time periods.
Study: A Deep-Learning Model Using Chest Radiographs for Prediction of 30-Day Mortality in Patients With Community-Acquired Pneumonia: Development and External Validation.
CAP, a common form of pneumonia, is associated with considerable mortality and medical resource utilization. Chest radiography is an essential tool for diagnosing CAP and stratifying risk.
However, incorporating chest radiograph findings into risk prediction tools has been limited due to inter-reader variability and difficulty extracting objective biomarkers. The CURB-65 score and pneumonia severity index are currently available tools for predicting adverse outcomes in CAP patients.
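For readers unfamiliar with CURB-65, the sketch below shows how the score is usually tallied: one point for each of five bedside criteria. The thresholds follow the commonly published definition of the score and are not taken from the study itself; the function name and the example patient are hypothetical.

```python
def curb65_score(confusion, urea_mmol_per_l, respiratory_rate,
                 systolic_bp, diastolic_bp, age_years):
    """Return a CURB-65 score from 0 to 5, one point per criterion met."""
    criteria = [
        bool(confusion),                         # C: new-onset confusion
        urea_mmol_per_l > 7.0,                   # U: blood urea above 7 mmol/L
        respiratory_rate >= 30,                  # R: 30 or more breaths per minute
        systolic_bp < 90 or diastolic_bp <= 60,  # B: low blood pressure
        age_years >= 65,                         # 65: age 65 years or older
    ]
    return sum(criteria)


# Hypothetical patient, for illustration only.
example_score = curb65_score(
    confusion=False,
    urea_mmol_per_l=8.2,
    respiratory_rate=32,
    systolic_bp=110,
    diastolic_bp=70,
    age_years=71,
)
print(example_score)
```

A higher score indicates a higher expected risk of death, which is why the score serves as the comparator for the DL model.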
About the study
In the present retrospective study, researchers developed and externally validated a DL-based model that predicts any-cause death within 30 days among CAP patients using their initial chest radiographs.
To assemble the development cohort, the team searched the electronic medical records (EMRs) of a single tertiary referral institution for individuals who received a CAP diagnosis during any healthcare encounter between March 2013 and December 2019.
The team evaluated the deep learning model among individuals diagnosed with CAP between January and December 2020 in the emergency department of the same institution that supplied the development cohort (the temporal test group, 947 individuals).
They also evaluated the model at two other institutions, i.e., the Seoul Metropolitan Government-Seoul National University Boramae Medical Center (external test group A, 467 individuals) between January and March 2020, and Chung-Ang University Hospital (external test group B, 381 individuals) between March 2019 and October 2021.
The development cohort included patients diagnosed with CAP during any encounter, while the subsequent test cohorts included only patients diagnosed with CAP during emergency department encounters. The team compared the area under the curve (AUC) values of the deep learning model and the CURB-65 tool, and evaluated the combination of the two using logistic regression modeling.
The primary outcome measure was any-cause mortality within 30 days of CAP diagnosis. A convolutional neural network (CNN) was developed to predict 30-day mortality after CAP diagnosis based on chest radiographs from patients in the developmental cohort.
The model outputs represented the conditional survival probabilities at different time intervals, and an experienced thoracic radiologist performed a post hoc analysis of class activation maps.
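One common way to turn per-interval conditional survival probabilities into a single cumulative risk is to multiply them and subtract the product from one. The sketch below illustrates that idea only; it assumes consecutive, non-overlapping intervals that together span the 30 days, and the interval values are hypothetical because the article does not describe the model's output layout.

```python
from math import prod


def cumulative_mortality(conditional_survival):
    """Cumulative mortality risk from per-interval conditional survival.

    Each entry is the probability of surviving one interval given survival
    up to the start of that interval.
    """
    return 1.0 - prod(conditional_survival)


# Hypothetical conditional survival for three consecutive intervals.
interval_survival = [0.98, 0.97, 0.99]
risk_30_day = cumulative_mortality(interval_survival)
print(round(risk_30_day, 3))
```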
The deep learning model was devised with participants allocated in a 3:1:1 ratio to the training, validation, and internal test groups. It estimates the 30-day any-cause death risk among CAP patients, taking the chest radiographs obtained at the time of diagnosis as inputs.
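A 3:1:1 allocation can be sketched as a shuffled split of patient identifiers. This is an illustration under assumptions: the article does not say whether the split was random or how identifiers were handled, and the function name and seed are hypothetical.

```python
import random


def split_3_1_1(patient_ids, seed=0):
    """Shuffle patient IDs and split them 3:1:1 into train, validation, test."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)   # seeded for reproducibility
    n_train = len(ids) * 3 // 5
    n_val = len(ids) // 5
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]       # remainder goes to the test set
    return train, val, test


# Hypothetical cohort of 100 patient IDs.
train_ids, val_ids, test_ids = split_3_1_1(range(100))
print(len(train_ids), len(val_ids), len(test_ids))
```

Splitting by patient rather than by image keeps radiographs from the same person out of more than one group.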
Mortality data were confirmed using EMRs or death registry data from the Ministry of the Interior and Safety, Republic of Korea.
The study analyzed the 30-day mortality rate in the developmental cohort, whose internal test set comprised 1,421 patients.
The AUC values for the estimated 30-day mortality risks were greater for the deep learning model than for CURB-65 among temporal test group participants (0.80 vs. 0.70). The differences were not statistically significant in external test groups A (0.8 vs. 0.7) and B (0.8 vs. 0.7).
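The AUC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison. The scores and labels are hypothetical, and the code does not reproduce the statistical test the authors used to compare AUC values.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 for death within
    30 days and 0 for survival.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and outcomes.
example_auc = auc([0.9, 0.4, 0.35, 0.8, 0.1], [1, 0, 1, 1, 0])
print(round(example_auc, 3))
```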
Compared to CURB-65, the DL model had similar sensitivity but higher specificity, with a positive predictive value (PPV) of 35% vs. 18% and a negative predictive value (NPV) of 95% vs. 94%. The DL model exhibited acceptable calibration in the temporal test group but significantly overestimated the risk of 30-day mortality in external test cohorts A and B.
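Sensitivity, specificity, PPV, and NPV all derive from the same four counts once a risk threshold has been fixed. The sketch below shows the arithmetic; the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged as high risk
        "specificity": tn / (tn + fp),  # survivors correctly flagged as low risk
        "ppv": tp / (tp + fp),          # flagged high risk who actually died
        "npv": tn / (tn + fn),          # flagged low risk who actually survived
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=30, fp=70, tn=180, fn=20)
print(metrics)
```

Because 30-day mortality is relatively uncommon, NPV tends to be high for any reasonable tool, so differences between tools show up mainly in specificity and PPV.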
The DL model was a significant predictor of 30-day mortality, with an odds ratio of 1.08 for each 1% increase in the predicted risk after adjusting for CURB-65 scores.
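To get a feel for what an odds ratio of 1.08 per 1% means, it can be compounded over larger differences in predicted risk. The sketch assumes the standard logistic regression interpretation, in which the effect is linear on the log-odds scale; the 10-point difference is a hypothetical example.

```python
def odds_ratio_for_increase(or_per_point, points):
    """Odds ratio implied by a multi-point increase under a log-linear effect."""
    return or_per_point ** points


# Hypothetical comparison: predicted risk higher by 10 points.
or_10_points = odds_ratio_for_increase(1.08, 10)
print(round(or_10_points, 2))
```

Under that assumption, a 10-point higher predicted risk corresponds to roughly 2.16 times the odds of death at the same CURB-65 score.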
In the external test groups A and B, the DL model, CURB-65 scores, and combined model showed qualitatively similar decision curves, with modest improvements in net benefit for the deep learning model and the combined model compared to the CURB-65 scores.
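Decision curves plot net benefit across a range of risk thresholds. The sketch below uses the standard net benefit formula, which rewards true positives and penalizes false positives by the odds of the chosen threshold. The counts and the threshold are hypothetical, not values from the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of treating patients whose predicted risk exceeds threshold.

    tp and fp are counts among n patients; threshold is a probability
    strictly between 0 and 1.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)  # harm of a false positive
    return tp / n - (fp / n) * weight


# Hypothetical counts at a 20% risk threshold.
nb = net_benefit(tp=30, fp=70, n=300, threshold=0.2)
print(round(nb, 4))
```

Repeating the calculation over many thresholds for each tool yields the curves that the comparison refers to.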
In patients with a high predicted risk, the areas of pneumonia on the radiograph influenced the DL model's predictions, whereas in patients with a low predicted risk, the predictions were influenced by other areas of the image.
The model was not influenced by irrelevant features like radiograph markers or extrinsic materials. The post hoc assessment of class activation maps showed that the DL model's predictions were largely accurate.
Overall, the study findings showed that the deep learning model could estimate mortality within 30 days of CAP diagnosis from the chest X-rays obtained for diagnosis, with better performance than the CURB-65 tool.
The model yielded an AUC of 0.77 to 0.80, with higher specificity (ranging between 61% and 69%) compared to CURB-65 (ranging between 44% and 58%) at similar sensitivity.
The model can guide decision-making and improve CAP outcomes by identifying high-risk patients (those requiring hospitalization and intensive treatment, including intravenous antibiotic therapy or respiratory support).
In contrast, early home discharge and conservative treatment for low-risk patients can reduce unnecessary medical resource utilization.