Early lung cancer detection with a machine learning model based on imaging, clinical, and DNA methylation biomarkers

In a recent study published in The Lancet Digital Health, researchers discuss the development and validation of a combined model comprising imaging, clinical, and cell-free deoxyribonucleic acid (DNA) methylation biomarkers for improved classification of pulmonary nodules and the earlier diagnosis of lung cancer.

Study: Accurate classification of pulmonary nodules by a combined model of clinical, imaging, and cell-free DNA methylation biomarkers: a model development and external validation study.


Lung cancer accounts for a substantial portion of cancer-associated mortality worldwide. Despite significant progress in the treatment of lung cancer, including chemotherapy, immunotherapy, surgical resection, targeted therapy, and radiotherapy, the prognosis for lung cancer patients remains poor.

The primary cause of the poor prognosis of lung cancer patients is late diagnosis. In fact, lung cancer is often diagnosed only after the disease has progressed to stage III or IV, when five-year survival rates are below 10%.

The early detection of lung cancer, when the disease is in the curable stages of 0–II, can significantly reduce mortality rates. However, the lack of sensitive technologies that can detect lung cancer at early stages, combined with the absence of clinical symptoms at those stages, remains a major challenge.

DNA methylation biomarkers offer a promising approach for the early detection of lung cancer, as evidence from various studies shows that DNA methylation in promoter CpG islands and other specific regions indicates events associated with the initiation of tumors. Additionally, the detection of methylation patterns in circulating tumor DNA using next-generation sequencing methods could be used to non-invasively screen for lung cancer.

Low-dose computerized tomography (LDCT) has been effective in the early detection of lung cancer in high-risk populations. Nevertheless, determining the malignancy risk of pulmonary nodules using LDCT remains difficult.

About the study

In the present study, researchers develop a combined model of clinical and imaging biomarkers (CIBM) that applies machine learning algorithms to imaging and clinical features to classify pulmonary nodules as malignant or benign. When combined with PulmoSeek, a cell-free DNA methylation model previously designed by the same team of scientists, the CIBM model can classify small pulmonary nodules and thereby help identify lung cancer in the early stages.

Study participants were recruited through a prospective sample collection and masked, retrospective evaluation study conducted at hospitals across 20 Chinese cities. Individuals included in the study were 18 years or older and had a solitary, non-calcified pulmonary nodule measuring 5–30 millimeters (mm) that was solid, part-solid, or pure ground-glass.

A cohort of over 800 samples was used to train the machine-learning algorithm of the CIBM model to classify benign and malignant tumors. The CIBM model was then integrated with PulmoSeek to create a combined model called PulmoSeek Plus.

A decision curve analysis was applied to evaluate the clinical utility of the model. A low cut-off chosen for high sensitivity and a high cut-off chosen for high specificity were used to classify pulmonary nodules into low-, medium-, and high-risk groups. The primary outcome was the performance and diagnostic ability of the three models: PulmoSeek, CIBM, and PulmoSeek Plus.
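The two-cut-off scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: the function names and the cut-off values of 0.2 and 0.8 are hypothetical placeholders, since the actual thresholds are not reported here. The second function computes net benefit, the standard quantity that a decision curve analysis plots against the threshold probability.

```python
def risk_group(score, low_cutoff=0.2, high_cutoff=0.8):
    """Assign a nodule to a risk group from a model score in [0, 1].

    The default cut-offs are hypothetical placeholders, not the study's values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low_cutoff:
        return "low"      # below the high-sensitivity cut-off
    if score >= high_cutoff:
        return "high"     # at or above the high-specificity cut-off
    return "medium"       # between the two cut-offs


def net_benefit(true_positives, false_positives, n, threshold):
    """Net benefit of acting on a model at a given threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    # False positives are weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return true_positives / n - (false_positives / n) * weight


# Hypothetical model scores for three nodules.
scores = [0.05, 0.35, 0.92]
groups = [risk_group(s) for s in scores]
print(groups)
```

Using two cut-offs instead of one leaves a medium-risk band for nodules that the model cannot confidently call benign or malignant, which is where further follow-up would typically be considered.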

Study findings

The PulmoSeek Plus model has the potential to successfully diagnose pulmonary nodules as benign or malignant in the early stages. When combined with LDCT, PulmoSeek Plus could be a robust tool for the early clinical assessment and management of lung cancer. Moreover, the only requirements for the integrated model were non-invasively collected blood samples and CT images.

Combining CIBM with the PulmoSeek model increased the sensitivity of the classification of pulmonary nodules by 6% and negative predictive value by 24%. Furthermore, the performance of the model was robust across pulmonary nodules of different types, sizes, and stages.
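For readers unfamiliar with these metrics, the sketch below shows how sensitivity and negative predictive value are calculated from classification counts. The counts are invented purely for illustration and are not taken from the study.

```python
def sensitivity(tp, fn):
    """Share of malignant nodules that the model correctly flags."""
    return tp / (tp + fn)


def negative_predictive_value(tn, fn):
    """Share of nodules called benign that are truly benign."""
    return tn / (tn + fn)


# Hypothetical counts: true positives, false negatives, true negatives.
tp, fn, tn = 95, 5, 40

sens = sensitivity(tp, fn)
npv = negative_predictive_value(tn, fn)
print(f"sensitivity={sens:.2f}, NPV={npv:.2f}")
```

Both metrics depend on the number of false negatives, so a model that misses fewer malignant nodules tends to improve on both at once.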

The sensitivity of the model was 98% for early-stage nodules and 99% for nodules smaller than one centimeter in size. For sub-solid nodules, which are difficult to characterize using LDCT results alone, the sensitivity was 100%.


The integrated PulmoSeek Plus model incorporates imaging, clinical, and cell-free DNA methylation biomarkers, as well as a machine-learning algorithm, for the early detection and classification of pulmonary nodules.

The validation of this model using independent cohorts confirms the high sensitivity and robust performance of PulmoSeek Plus across a range of samples. When combined with LDCT, PulmoSeek Plus could facilitate the early detection of lung cancers, thus improving the prognosis for many lung cancer patients.

Journal reference:
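Accurate classification of pulmonary nodules by a combined model of clinical, imaging, and cell-free DNA methylation biomarkers: a model development and external validation study. The Lancet Digital Health.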

Written by

Dr. Chinta Sidharthan

Chinta Sidharthan is a writer based in Bangalore, India. Her academic background is in evolutionary biology and genetics, and she has extensive experience in scientific research, teaching, science writing, and herpetology. Chinta holds a Ph.D. in evolutionary biology from the Indian Institute of Science and is passionate about science education, writing, animals, wildlife, and conservation. For her doctoral research, she explored the origins and diversification of blindsnakes in India, as a part of which she did extensive fieldwork in the jungles of southern India. She has received the Canadian Governor General’s bronze medal and Bangalore University gold medal for academic excellence and published her research in high-impact journals.


Please use the following format to cite this article in your essay, paper or report:

Sidharthan, Chinta. (2023, August 15). Early lung cancer detection with a machine learning model based on imaging, clinical, and DNA methylation biomarkers. News-Medical. Retrieved on September 26, 2023 from https://www.news-medical.net/news/20230815/Early-lung-cancer-detection-with-a-machine-learning-model-based-on-imaging-clinical-and-DNA-methylation-biomarkers.aspx.

