Two studies led by Johns Hopkins Kimmel Cancer Center, Ludwig Center, and Johns Hopkins Whiting School of Engineering researchers report on a powerful new method that significantly improves the reliability and accuracy of artificial intelligence (AI) for many applications. As an example, they apply the new method to early cancer detection from blood samples, known as liquid biopsy.
One study reports on the development of MIGHT (Multidimensional Informed Generalized Hypothesis Testing), an AI method that the researchers created to meet the high level of confidence needed for AI tools used in clinical decision making. To illustrate the benefits of MIGHT, they used it to develop a test for early cancer detection using circulating cell-free DNA (ccfDNA), fragments of DNA circulating in the blood.
A companion study found that ccfDNA fragmentation patterns used to detect cancer also appear in patients with autoimmune and vascular diseases. To develop a test with high sensitivity for cancer but reduced false-positive results, MIGHT was expanded to incorporate data from autoimmune and vascular diseases obtained from colleagues at Johns Hopkins and other institutions who treat and study these diseases.
The studies, supported in part by the National Institutes of Health, are to be published on Aug. 18 in the Proceedings of the National Academy of Sciences.
A related article, authored by three researchers from Johns Hopkins, Pixar co-founder Ed Catmull, Ph.D., and Microsoft chief data scientist of the AI for Good Lab Juan Lavista Ferres, was published concurrently in Cancer Discovery, a publication of the American Association for Cancer Research. It discusses the challenges of incorporating AI into clinical practice, including challenges addressed by MIGHT.
MIGHT fine-tunes itself using real data and checks its accuracy on different subsets of the data, using tens of thousands of decision trees. It can be applied to any field employing big data, ranging from astronomy to zoology. It is particularly effective for the analysis of biomedical datasets with many variables but relatively few patient samples, a common situation in which traditional AI models often falter.
In tests using patient data, MIGHT consistently outperformed other AI methods in both sensitivity and consistency. It was applied to the blood of 1,000 individuals: 352 patients with advanced cancers and 648 individuals without cancer.
For each sample, the researchers evaluated 44 different variable sets, each consisting of a set of biological features such as DNA fragment lengths or chromosomal abnormalities. Features based on aneuploidy (an abnormal number of chromosomes) delivered the best cancer detection performance, with a sensitivity of 72% (the share of cancers detected) at 98% specificity (the share of cancer-free individuals correctly identified as such). This balance is critical in real-world medical applications where minimizing false positives is necessary to avoid unneeded procedures.
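To make "sensitivity at a fixed specificity" concrete, here is a minimal Python sketch. It is not the MIGHT procedure; it only illustrates the metric. It picks a score cutoff from the cancer-free group so that the target specificity is met, then measures how many cancer samples score above that cutoff. The function name, its parameters, and the scores in the usage note are hypothetical and are not taken from the studies.

```python
import math


def sensitivity_at_specificity(cancer_scores, control_scores, target_specificity=0.98):
    """Return (threshold, sensitivity, specificity) for a cutoff that meets the target.

    Scores are assumed to be "higher means more cancer-like" (an assumption of
    this sketch, not a detail taken from the studies).
    """
    if not cancer_scores or not control_scores:
        raise ValueError("both groups need at least one score")

    n_controls = len(control_scores)
    # Number of cancer-free samples that may be called positive
    # while still meeting the target specificity.
    allowed_false_positives = math.floor((1.0 - target_specificity) * n_controls)
    allowed_false_positives = min(allowed_false_positives, n_controls - 1)

    # A sample is called positive only if its score is strictly above the
    # threshold, so taking the (allowed + 1)-th highest control score
    # caps the number of false positives at the allowed count.
    ranked_controls = sorted(control_scores, reverse=True)
    threshold = ranked_controls[allowed_false_positives]

    true_positives = sum(score > threshold for score in cancer_scores)
    false_positives = sum(score > threshold for score in control_scores)

    sensitivity = true_positives / len(cancer_scores)
    specificity = 1.0 - false_positives / n_controls
    return threshold, sensitivity, specificity
```

For example, with made-up control scores 0.1 through 0.8 and cancer scores 0.65, 0.75, 0.9 and 0.3, calling the function with target_specificity=0.75 allows two false positives, sets the threshold at 0.6, and detects three of the four cancer samples.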
"MIGHT gives us a powerful way to measure uncertainty and increase reliability, especially in situations where sample sizes are limited but data complexity is high."
Joshua Vogelstein, PhD, Study Lead Investigator and Associate Professor, Biomedical Engineering, Johns Hopkins Medicine
MIGHT was also extended to a companion algorithm, called CoMIGHT, to determine whether combining multiple variable sets could improve cancer detection.
The researchers applied CoMIGHT to blood samples from 125 patients with early-stage breast cancer and 125 patients with early-stage pancreatic cancer, which were analyzed along with samples from 500 controls (participants without cancer). While pancreatic cancers were detected more often than breast cancers, CoMIGHT analysis suggested that detection of early-stage breast cancer might benefit from combining multiple biological signals, highlighting the tool's potential for tailoring detection strategies by cancer type.
In the companion study, Christopher Douville, PhD, assistant professor of oncology; Samuel Curtis, PhD, postdoctoral fellow in the Ludwig Center; and their teams serendipitously discovered that ccfDNA fragmentation signatures previously believed to be specific to individuals with cancer also occur in patients with other diseases, including autoimmune conditions such as lupus, systemic sclerosis and dermatomyositis, and vascular diseases like venous thromboembolism.
Among individuals with abnormal fragmentation signatures, they found an increase in inflammatory biomarkers in all patients, whether they had autoimmune diseases, vascular disease, or cancer. Their results suggest that inflammation, rather than cancer per se, is responsible for fragmentation signals, complicating efforts to use ccfDNA fragmentation as a biomarker specific for cancer.
To address the challenge of mistaking inflammation for cancer, the team added information characteristic of inflammation to the training data for MIGHT. The enhanced version reduced, but did not completely eliminate, the false-positive results from non-cancerous diseases.
"Our main goal was to further investigate the biological mechanisms responsible for fragmentation signatures that have previously been thought to be specific for cancer," says Curtis. "As the field moves to more complex biomarkers, understanding the underlying biological mechanisms leading to the results are critical to their interpretation, particularly to avoid false positive results. Our new data indicate that patients with diseases other than cancer can be mistakenly believed to have cancer unless appropriate safeguards are incorporated into the tests."
Adds Douville, "A silver lining of this study is that reworking of MIGHT could result in a separate diagnostic test for inflammatory diseases."
Together, the studies demonstrate the promise as well as the complexities of developing trustworthy clinical technologies using AI. In a related editorial, researchers noted several critical challenges that need to be addressed so that tools like MIGHT can be fully integrated into clinical practice.
They identified eight key barriers to bringing AI into routine clinical care. In simple terms, these include the false expectation that AI tools need to be flawless before they're considered useful; the need to present results as probabilities rather than simple yes-or-no answers; making sure AI predictions match real-world probabilities; ensuring results are reproducible; training models on diverse populations; explaining how AI makes decisions; recognizing how test accuracy can change when diseases are rare; and avoiding over-reliance on computer-generated recommendations.
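The point about rare diseases can be illustrated with Bayes' rule. The Python sketch below computes the positive predictive value (PPV), the chance that a positive result reflects real disease, from sensitivity, specificity, and prevalence. The 72% sensitivity and 98% specificity are the figures reported above for the 1,000-person cohort, in which 35.2% of participants had cancer. The lower prevalence values are hypothetical and chosen only for illustration, and the function name is an assumption of this sketch.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive test result is a true positive (Bayes' rule)."""
    # Share of the whole tested population that is diseased AND tests positive.
    true_positive_share = sensitivity * prevalence
    # Share of the whole tested population that is healthy AND tests positive.
    false_positive_share = (1.0 - specificity) * (1.0 - prevalence)
    total_positive_share = true_positive_share + false_positive_share
    if total_positive_share == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive_share / total_positive_share


# 0.352 is the share of cancer patients in the study cohort;
# the other prevalence values are hypothetical screening scenarios.
for prevalence in (0.352, 0.05, 0.01, 0.001):
    ppv = positive_predictive_value(0.72, 0.98, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

With sensitivity and specificity held fixed, PPV falls as the disease becomes rarer. At a hypothetical prevalence of 1%, only about 27% of positive results would be true positives, even at 98% specificity.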
"MIGHT could be applied to any field where measuring uncertainty and having confidence in the reliability and reproducibility of findings is key. This could be in the natural sciences, social sciences, or medical sciences. Research across all fields of science requires confidence that what the algorithm is spitting out is real, reproducible, and reliable," says Joshua Vogelstein.
The researchers say results obtained using AI technologies should be viewed as AI-informed data that can complement but not replace clinical judgment. Although MIGHT and CoMIGHT offer powerful new tools in cancer detection, and potentially inflammatory and vascular disease detection, they say that further clinical trials and validation are necessary before such tests can be extended to clinical use.
"Trust in the result is essential, and now that there is a reliable, quantitative tool in MIGHT, we and other researchers can use it and focus our efforts on studying more patients and adding statistically meaningful features to our tests for earlier cancer detection," says Bert Vogelstein, M.D., Clayton Professor of Oncology, co-director of the Ludwig Center, Howard Hughes Medical Institute investigator, and study co-leader.
MIGHT and its companion algorithm, CoMIGHT, are now publicly available at treeple.ai.