Machine learning model guides smarter gene selection in newborn screening

Mass General Brigham, May 10, 2025

More than a decade ago, researchers launched the BabySeq Project, a pilot program to return newborn genomic sequencing results to parents and measure the effects on newborn care. Today, over 30 international initiatives are exploring the expansion of newborn screening using genomic sequencing (NBSeq), but a new study by researchers from Mass General Brigham highlights the substantial variability in gene selection among those programs. In a paper published in Genetics in Medicine, an official journal of the American College of Medical Genetics and Genomics, they offer a data-driven approach to prioritizing genes for public health consideration.

"It's critical that we be thoughtful about which genes and conditions are included in genomic newborn screening programs. By leveraging machine learning, we can provide a tool that helps policymakers and clinicians make more informed choices, ultimately improving the impact of genomic screening programs."

Nina Gold, MD, co-senior author, director of Prenatal Medical Genetics and Metabolism at Massachusetts General Hospital (MGH)

The authors introduce a machine learning model that brings structure and consistency to the selection of genes for NBSeq programs. This is the first publication from the International Consortium of Newborn Sequencing (ICoNS), founded in 2021 by senior author Robert C. Green, MD, MPH, director of the Genomes2People Research Program at Mass General Brigham, and David Bick, MD, PhD, of Genomics England in the United Kingdom.

Researchers analyzed 4,390 genes included across 27 NBSeq programs, identifying key factors influencing gene inclusion. While the number of genes analyzed by each program ranged from 134 to 4,299, only 74 genes (1.7%) were included in more than 80% of programs. The strongest predictors of gene inclusion were whether the condition is on the U.S. Recommended Uniform Screening Panel, whether it has robust natural history data, and whether there is strong evidence of treatment efficacy.
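
To make the overlap figure concrete, the sketch below shows one way such a cross-program tally could be computed. It is only an illustration: the program names, the gene labels (GENE_A and so on), and the helper functions are hypothetical placeholders, not data or code from the study. Only the 74 and 4,390 counts come from the text above.

```python
from collections import Counter

def inclusion_fractions(program_gene_lists):
    """Return {gene: fraction of programs whose list includes that gene}."""
    n_programs = len(program_gene_lists)
    counts = Counter()
    for genes in program_gene_lists.values():
        counts.update(set(genes))  # count each gene at most once per program
    return {gene: count / n_programs for gene, count in counts.items()}

def widely_included(fractions, threshold=0.8):
    """Genes included in strictly more than `threshold` of programs."""
    return sorted(g for g, f in fractions.items() if f > threshold)

# Made-up gene lists for five hypothetical programs (illustration only).
programs = {
    "program_1": ["GENE_A", "GENE_B", "GENE_C"],
    "program_2": ["GENE_A", "GENE_B"],
    "program_3": ["GENE_A", "GENE_C", "GENE_D"],
    "program_4": ["GENE_A", "GENE_B", "GENE_D"],
    "program_5": ["GENE_A", "GENE_B"],
}

fractions = inclusion_fractions(programs)
core = widely_included(fractions)

# Share reported in the article: 74 of the 4,390 genes analyzed.
reported_share_percent = round(74 / 4390 * 100, 1)
```

In this toy input, GENE_B appears in exactly 80% of programs and so does not clear a strict "more than 80%" cutoff. Whether the study treats the boundary this way is not stated in the article.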

Using these insights, the team developed a machine learning model incorporating 13 predictors, achieving high accuracy in predicting gene selection across programs. The model provides a ranked list of genes that can adapt to new evidence and regional needs, enabling more consistent and informed decision-making in NBSeq initiatives worldwide.
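
The article does not describe the model's type or its full set of 13 predictors, so the following is only a rough sketch of how a ranked gene list might be produced from a scoring model. It assumes a simple logistic score over three of the predictors named above. The weights, bias, feature names, and gene labels are all invented for illustration and are not taken from the study.

```python
import math

# Hypothetical weights for three predictors (assumed values, not from the study).
WEIGHTS = {
    "on_rusp": 2.0,             # condition is on the U.S. RUSP
    "natural_history": 1.0,     # robust natural history data
    "treatment_efficacy": 1.5,  # strong evidence of treatment efficacy
}
BIAS = -2.0  # assumed intercept

def inclusion_score(features):
    """Logistic score in (0, 1); higher means more likely to be included."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def rank_genes(gene_features):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    scored = [(gene, inclusion_score(f)) for gene, f in gene_features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up feature values for placeholder genes.
genes = {
    "GENE_C": {"on_rusp": 0, "natural_history": 0, "treatment_efficacy": 0},
    "GENE_A": {"on_rusp": 1, "natural_history": 1, "treatment_efficacy": 1},
    "GENE_B": {"on_rusp": 0, "natural_history": 1, "treatment_efficacy": 1},
}

ranking = rank_genes(genes)
for gene, score in ranking:
    print(f"{gene}: {score:.2f}")
```

Under a design like this, adapting to new evidence would amount to updating a gene's feature values and re-ranking, and a region could adjust the weights to reflect its own priorities.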

"This research represents a significant step toward harmonizing NBSeq programs and ensuring that gene selection reflects the latest scientific evidence and public health priorities," said Green.

Source:

Mass General Brigham

Journal reference:

Minten, T., et al. (2025). Data-driven consideration of genetic disorders for global genomic newborn screening programs. Genetics in Medicine. doi.org/10.1016/j.gim.2025.101443.
