A recent Scientific Reports study discusses the development of an artificial intelligence (AI) prognostic model for surgically resected non-small cell lung cancer (NSCLC).
Study: Development of artificial intelligence prognostic model for surgically resected non-small cell lung cancer.
NSCLC is the most common type of lung cancer worldwide and is typically treated with surgical resection, followed by chemotherapy and radiotherapy. Although prognosis after surgery for NSCLC is estimated from the tumor stage according to the TNM classification, this estimate does not always match the actual outcome. Thus, there remains an urgent need for better tools to accurately predict patient outcomes and formulate better treatment strategies.
Multiple factors associated with postoperative prognosis in NSCLC have been identified, including the geriatric nutritional risk index, Glasgow prognostic score, neutrophil/lymphocyte ratio, C-reactive protein (CRP)/albumin ratio, prognostic nutritional index, platelet/lymphocyte ratio, and monocyte/lymphocyte ratio. To date, few studies have described the importance of blood test results in NSCLC prognosis.
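Several of the indices listed above are simple arithmetic on routine blood counts. The following is a minimal sketch of how they might be computed, not the study's own code. The `BloodTest` class, its field names, and the sample values are hypothetical, and the prognostic nutritional index uses the commonly cited definition (10 × albumin in g/dL + 0.005 × lymphocyte count per µL), which the article itself does not state. The geriatric nutritional risk index and Glasgow prognostic score are omitted because they require body weight and cut-off values not given here.

```python
from dataclasses import dataclass


@dataclass
class BloodTest:
    """Hypothetical container for one patient's blood test values."""
    neutrophils: float   # cells per microliter
    lymphocytes: float   # cells per microliter
    monocytes: float     # cells per microliter
    platelets: float     # cells per microliter
    crp_mg_dl: float     # C-reactive protein, mg/dL
    albumin_g_dl: float  # serum albumin, g/dL


def ratio_indices(b: BloodTest) -> dict:
    """Compute ratio-based indices from a single blood test."""
    return {
        "NLR": b.neutrophils / b.lymphocytes,  # neutrophil/lymphocyte ratio
        "PLR": b.platelets / b.lymphocytes,    # platelet/lymphocyte ratio
        "MLR": b.monocytes / b.lymphocytes,    # monocyte/lymphocyte ratio
        "CAR": b.crp_mg_dl / b.albumin_g_dl,   # CRP/albumin ratio
        # Assumed definition of the prognostic nutritional index.
        "PNI": 10 * b.albumin_g_dl + 0.005 * b.lymphocytes,
    }


# Illustrative values only; not taken from the study.
example = BloodTest(
    neutrophils=4000, lymphocytes=2000, monocytes=400,
    platelets=250000, crp_mg_dl=0.2, albumin_g_dl=4.0,
)
indices = ratio_indices(example)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

All counts must share the same unit for the ratios to be meaningful, which is why the sketch fixes cells per microliter throughout.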
Previous studies have highlighted the importance of AI in medicine, as demonstrated by the recent application of AI for the early diagnosis of lung cancer. AI-based models have also been developed to predict the therapeutic efficacy of chemotherapy.
About the study
The current study discusses the development of an AI prognostic model for NSCLC using machine learning (ML). This model used preoperative and postoperative blood test results for its predictions.
A total of 1,049 patients with pathological stage (p-Stage) I-IIIA NSCLC who underwent surgery between January 2003 and December 2016 were included in the study. The median age at surgery was 69 years, and about 58% of the participants were male.
Patients' clinical information and follow-up data were obtained from the electronic health record system. The clinicopathological characteristics considered included age at surgery, body mass index (BMI), sex, smoking history, forced vital capacity (FVC), forced expiratory volume in one second (FEV1.0), surgical procedure, histological type, and adjuvant chemotherapy.
Preoperative and postoperative blood test data were assessed. Carcinoembryonic antigen (CEA) and cytokeratin-19 fragments (CYFRA) data for the three-month period before surgery were also analyzed.
XGBoost, a decision tree-based model, was selected as the algorithm for this AI prognostic model. XGBoost has an advantage over many other ML algorithms in that it can use missing values directly as information rather than requiring them to be imputed or discarded.
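To illustrate what using missing values directly can mean in a tree model, the sketch below fits a single split and, for each candidate threshold, tries sending the missing values to the left or to the right branch, keeping whichever direction gives the lower error. This is a deliberately simplified illustration of the idea of a learned default direction: XGBoost itself scores splits with gradient statistics rather than the plain squared error used here, and the function names and data are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_with_missing(xs, ys):
    """Find (threshold, missing_goes_left, loss) minimising total squared error.

    xs may contain None for a missing measurement. Each candidate split is
    evaluated twice: once with missing values routed left, once routed right.
    """
    present = sorted({x for x in xs if x is not None})
    thresholds = [(a + b) / 2 for a, b in zip(present, present[1:])]
    best = None
    for t in thresholds:
        for missing_left in (True, False):
            left, right = [], []
            for x, y in zip(xs, ys):
                go_left = missing_left if x is None else x < t
                (left if go_left else right).append(y)
            loss = sse(left) + sse(right)
            if best is None or loss < best[2]:
                best = (t, missing_left, loss)
    return best


# Illustrative data only: a marker value (None = not measured) and a risk label.
marker = [1.0, 2.0, None, 8.0, 9.0, None]
risk = [0.1, 0.2, 0.9, 0.8, 0.9, 0.8]
threshold, missing_left, loss = best_split_with_missing(marker, risk)
print(threshold, missing_left, round(loss, 3))
```

In this toy data the patients with no measurement resemble the high-marker group, so the split learns to route missing values to the right branch instead of discarding those patients.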
Most of the study participants underwent lobectomy, followed by wedge resection, segmentectomy, bilobectomy, and pneumonectomy. Furthermore, most patients were diagnosed with p-Stage IA NSCLC.
The Kaplan-Meier curve provided information on disease-free survival (DFS), overall survival (OS), and cancer-specific survival (CSS) rates of the overall cohort and according to p-Stage. After 5.06 years, the number of OS, DFS, and CSS events was 214, 214, and 123, respectively.
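For readers unfamiliar with the method, a Kaplan-Meier curve can be computed in a few lines. The following is a minimal sketch of the standard product-limit estimator, with made-up follow-up data rather than the study's cohort; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death or recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Illustrative data only (years of follow-up, event indicator).
followup_years = [1, 2, 2, 3, 4, 5]
event_observed = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(followup_years, event_observed)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the curve themselves, which is how the estimator accommodates incomplete follow-up.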
The model's predictions of DFS, OS, and CSS were evaluated using time-dependent receiver operating characteristic (ROC) curves and area under the curve (AUC) values, all of which indicated good prediction accuracy. Notably, the predicted probability of outcome events at five years following surgery was highly accurate.
The prediction accuracy for five-year DFS, OS, and CSS was reflected by AUC values of 0.890, 0.926, and 0.960, respectively. This prediction accuracy was comparable with the accuracy levels of previous models.
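An AUC can be read as the probability that a randomly chosen patient who had the event received a higher predicted risk than a randomly chosen patient who did not. The sketch below computes that quantity at a fixed time horizon, under a simplifying assumption: patients censored before the horizon are dropped, whereas proper time-dependent ROC methods reweight them. It is not the study's evaluation code, and all names and values are hypothetical.

```python
def auc(scores, labels):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_at_horizon(risk, times, events, horizon):
    """Naive fixed-horizon AUC.

    A patient counts as an event if the event occurred by `horizon`, as a
    non-event if followed beyond `horizon`, and is dropped if censored earlier
    (a simplification; their status at the horizon is unknown).
    """
    scores, labels = [], []
    for r, t, e in zip(risk, times, events):
        if e == 1 and t <= horizon:
            scores.append(r)
            labels.append(1)
        elif t > horizon:
            scores.append(r)
            labels.append(0)
    return auc(scores, labels)


# Illustrative data only: predicted risk, follow-up in years, event indicator.
predicted_risk = [0.9, 0.8, 0.85, 0.2, 0.6, 0.1]
followup_years = [2, 4, 6, 7, 3, 8]
event_observed = [1, 1, 0, 1, 0, 0]
five_year_auc = auc_at_horizon(predicted_risk, followup_years, event_observed, 5)
print(round(five_year_auc, 3))
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.890 to 0.960.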
Histological analysis showed that 81.5% of the patients had adenocarcinoma. The histological types recorded included squamous cell carcinoma, micropapillary/solid predominant adenocarcinoma, lepidic predominant adenocarcinoma, acinar/papillary predominant adenocarcinoma, and large cell neuroendocrine carcinoma, in varying proportions.
Histological type was found to be one of the most important prognostic factors in this AI model. The prognoses of adenosquamous, pleomorphic, and large-cell neuroendocrine carcinoma were worse than those of other histological types. Thus, a more detailed analysis of histological type could improve prognostic accuracy.
CEA, CYFRA, coagulation-related factors, and immuno-nutrition indices significantly contributed to patient prognosis. Interestingly, factors that reflect liver and renal function, including creatinine, urea nitrogen, and aspartate aminotransferase, also contributed to the prognosis of NSCLC.
The current study has some limitations, including the consideration of participants from a single institution. In the future, a similar study utilizing data from multiple cohorts at different institutions is needed to validate these findings.
Another limitation of this study is that the formula used in the XGBoost model was difficult to verify. However, bootstrap validation was performed to confirm its prediction accuracy.
Despite these limitations, the current study demonstrated that using a large number of blood test results is a promising approach to accurate prognosis in NSCLC. The newly developed AI prognostic model predicted postoperative prognosis for surgically resected NSCLC with good accuracy.
- Kinoshita, F., Takenaka, T., Yamashita, T., et al. (2023). Development of artificial intelligence prognostic model for surgically resected non-small cell lung cancer. Scientific Reports 13(1); 1-10. doi:10.1038/s41598-023-42964-8