AI model flags hidden breast cancers years before diagnosis in routine mammograms

A large NHS screening study shows that artificial intelligence can detect subtle signals in “normal” mammograms that reveal which women are most likely to develop aggressive interval cancers years before they appear.

Study: Performance of breast cancer risk prediction algorithms across mammography systems in the UK screening programme. Image Credit: CameraCraft / Shutterstock

In a recent study published in the journal npj Digital Medicine, researchers carried out a large-scale retrospective validation (n = 112,621) of four state-of-the-art Deep Learning (DL) algorithms for predicting “interval cancers”. These cancers, which are diagnosed after a negative screening mammogram but before the next scheduled screening examination, account for approximately 30% of cancers in screening programs and represent a critical diagnostic gap in current mammogram-based screening approaches.

The study's findings revealed the academic DL model Mirai (developed by MIT) as the best-performing model (interval cancer AUC = 0.77). The model identified about 27.5% of interval cancers in the study cohort by flagging the top 4% of “normal” (negative) screening mammogram images as the highest risk.

The study noted that model performance varied slightly across the machines used to produce the mammogram images, and that one algorithm showed statistically significant differences between systems. Even so, the findings suggest that DL tools could support risk-stratified breast cancer screening strategies, although prospective clinical evaluation would be required before implementation.

Background: The Challenge of Interval Breast Cancers

For decades, breast cancer screening recommendations have involved women receiving a mammogram once every few years (e.g., every 3 years in the United Kingdom [UK]). However, a growing body of evidence suggests that while these periodic screenings are necessary and effective at detecting most breast cancers, they miss “interval cancers”, that is, cancers diagnosed after a negative screening mammogram but before the next scheduled screening.

These “hidden” cancers, which are observed to develop or become clinically apparent in the periods between screening schedules, are often significantly more aggressive than those detected in routine mammograms, leading to worse prognosis and clinical outcomes, including death.

Traditional approaches to addressing interval cancers have involved clinicians attempting to predict individual risk via genetic assessments (such as polygenic risk scores, which are not routinely implemented in most population screening programs) and family history evaluations (often incomplete).

However, recent advances in Deep Learning (DL) algorithms have led researchers to hypothesize that these Artificial Intelligence (AI) models, trained on millions of mammogram images, may be able to recognize subtle imaging patterns and tissue characteristics in breast tissue that human radiologists might overlook.

Unfortunately, given the wealth of commercial and academic DL models currently available, clinicians do not yet know which model to choose and whether these tools can perform well enough to be included in personalized care.

Study Objective and Model Comparison

The present study aimed to address this knowledge gap by conducting a head-to-head comparison of the breast cancer predictive performance of four of today’s most advanced DL models: Mirai (MIT), iCAD ProFound AI Risk (a commercially available model), Transpara Risk (another commercially available DL tool), and Google Health’s Risk Model.

Validation Dataset From the UK NHS Screening Program

These models were provided with an extensive retrospective validation dataset from the UK’s National Health Service (NHS). The dataset comprised high-resolution “negative” (cancer-free) screening mammograms (n = 112,621) collected between 2014 and 2017 from two distinct NHS screening sites.

Model performance was validated by tracking participants for five years to observe which women eventually developed breast cancers (approximately 1,225 cancers across the follow-up period), including interval cancers.

Evaluation Across Mammography Hardware Platforms

To assess whether algorithm performance generalizes across mammography hardware platforms, the DL models were evaluated on images acquired on machines from two manufacturers, Philips and GE.

Predictive Performance of Deep Learning Models

The study findings revealed that the academic algorithm Mirai consistently demonstrated the highest predictive power (Area Under the Curve [AUC] = 0.72; p < 0.001). While iCAD (AUC = 0.70), Google (AUC = 0.68), and Transpara (AUC = 0.65) achieved lower scores, their predictive performance was still notable given that the input mammograms had previously been interpreted as “normal” during routine screening.
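For readers unfamiliar with the metric, an AUC can be read as the probability that a randomly chosen woman who later developed cancer received a higher risk score than a randomly chosen woman who did not. The sketch below computes it that way on made-up toy data; the function name `auc` and the scores are illustrative assumptions and are not taken from the study or from any of the four models.

```python
def auc(scores, labels):
    """AUC via pairwise comparison (the Mann-Whitney formulation).

    scores: model risk scores, higher means higher predicted risk.
    labels: 1 if the woman later developed cancer, otherwise 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one cancer and one non-cancer case")

    # Count the pairs in which the cancer case outranks the non-cancer case.
    # Tied scores count as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort: eight "normal" mammograms, three later cancers.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
print(round(auc(toy_scores, toy_labels), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.65 to 0.72 sit. The pairwise loop is quadratic in cohort size, so it suits illustration rather than a cohort of more than 100,000 mammograms.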

Identification of High-Risk Patients for Interval Cancers

The results indicated that these models could identify future interval cancers from screening examinations initially interpreted as negative (Mirai’s interval cancer AUC = 0.77). When researchers examined the top 4% of women ranked by Mirai as “highest risk,” about 27.5% of all interval cancers in the cohort occurred within this high-risk group during follow-up.

Expanding this high-risk group to the top 14% of women nearly doubled the interval cancer detection yield, capturing approximately 50.3% of all future interval cancers in the cohort.
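The arithmetic behind these figures can be sketched as follows. The helper names `capture_rate` and `enrichment` and the toy cohort are hypothetical; only the percentages 4%, 14%, 27.5% and 50.3% come from the article. Enrichment here simply means the share of cancers captured divided by the share of women flagged, so a value of 1 would be what random selection achieves.

```python
import math


def capture_rate(scores, labels, top_fraction):
    """Share of all positive cases that fall in the highest-scoring top_fraction."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive cases to capture")
    n_flagged = math.ceil(top_fraction * len(scores))
    # Rank cases from highest to lowest score and keep the flagged group.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    captured = sum(y for _, y in ranked[:n_flagged])
    return captured / total_pos


def enrichment(captured_share, flagged_share):
    """How many times more cancers the flagged group holds than a random group of that size."""
    return captured_share / flagged_share


# Hypothetical toy cohort, for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
print(capture_rate(toy_scores, toy_labels, 0.25))

# Figures reported in the article.
print(round(enrichment(0.275, 0.04), 2))  # top 4% of women
print(round(enrichment(0.503, 0.14), 2))  # top 14% of women
print(round(0.503 / 0.275, 2))            # yield ratio between the two thresholds
```

On the reported figures, the top 4% group holds roughly 6.9 times as many interval cancers as a random 4% would, and the top 14% group roughly 3.6 times as many. The yield ratio between the two thresholds is about 1.8, so widening the group raises total capture while diluting the concentration of cases within it.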

Performance Across Mammography Machine Manufacturers

The study also evaluated whether algorithm performance differed across mammography machine manufacturers. Researchers found that three of the four evaluated models performed statistically similarly on images generated by Philips and GE machines. The exception was Transpara, which performed better on images generated by GE machines than on those generated by Philips machines, although the difference was relatively modest (AUC = 0.69 versus 0.62).
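One common way to judge whether an AUC gap between two machine types exceeds sampling noise is a bootstrap confidence interval on the difference. The article does not say which statistical test the authors used, so the sketch below is an assumed, generic approach on made-up data, not the study's method. The names `auc_difference_bootstrap`, `ge_like` and `philips_like` are hypothetical.

```python
import random


def auc(scores, labels):
    """Pairwise (Mann-Whitney) AUC; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_difference_bootstrap(group_a, group_b, n_boot=1000, seed=0):
    """Observed AUC(a) - AUC(b) with a 95% percentile bootstrap interval.

    group_a, group_b: lists of (score, label) pairs, one list per machine type.
    """
    rng = random.Random(seed)

    def group_auc(group):
        return auc([s for s, _ in group], [y for _, y in group])

    observed = group_auc(group_a) - group_auc(group_b)
    diffs = []
    while len(diffs) < n_boot:
        # Resample each group with replacement, keeping its original size.
        resample_a = [rng.choice(group_a) for _ in group_a]
        resample_b = [rng.choice(group_b) for _ in group_b]
        try:
            diffs.append(group_auc(resample_a) - group_auc(resample_b))
        except ValueError:
            continue  # this resample lacked a cancer or a non-cancer case
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, (lower, upper)


# Hypothetical toy data standing in for two machine types.
ge_like = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.3, 0), (0.2, 0)]
philips_like = [(0.9, 1), (0.8, 0), (0.7, 1), (0.6, 0), (0.3, 1), (0.2, 0)]
observed, (lower, upper) = auc_difference_bootstrap(ge_like, philips_like)
print(round(observed, 3), round(lower, 3), round(upper, 3))
```

If the interval excludes zero, the gap is unlikely to be sampling noise alone. With six cases per group the toy interval is very wide, which illustrates why a cohort on the scale of this study is needed to resolve a difference as small as 0.69 versus 0.62.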

The researchers also highlight several limitations, including the exclusion of mammograms with implants or non-standard imaging views, incomplete ethnicity data, and the possibility that results may not fully generalize to mammography systems from other major vendors. The authors also note that retrospective validation may underestimate the potential clinical utility, since some cancers might be detected through additional imaging pathways rather than solely through symptomatic presentation.

Conclusions: Toward Risk-Stratified Breast Cancer Screening

The present study provides evidence suggesting that DL models can identify previously unrecognized imaging signals from standard mammograms to predict future cancer risk. Models such as MIT’s Mirai were shown to identify and flag a significant proportion of interval cancers in a small group of high-risk women.

Future work should aim to investigate these results in prospective clinical trials and real-world screening settings before such tools can be integrated into personalized screening protocols.

Journal reference:
  • Rothwell, J., et al. (2026). Performance of breast cancer risk prediction algorithms across mammography systems in the UK screening programme. npj Digital Medicine. DOI: 10.1038/s41746-026-02507-7. https://www.nature.com/articles/s41746-026-02507-7

Written by

Hugo Francisco de Souza

Citation

Francisco de Souza, Hugo. (2026, March 09). AI model flags hidden breast cancers years before diagnosis in routine mammograms. News-Medical. https://www.news-medical.net/news/20260309/AI-model-flags-hidden-breast-cancers-years-before-diagnosis-in-routine-mammograms.aspx
