Despite the increasing use of genomic sequencing in clinical practice, interpreting rare genetic mutations remains difficult, even in well-studied disease genes. Current predictive models help interpret those mutations, but they are prone to misclassifying mutations that do not cause disease as pathogenic, which contributes to false positives.
Researchers from the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG) and the Center for Systems Biology Dresden (CSBD) in Dresden, Germany, and Harvard Medical School in Boston, USA, have developed a tool called Deciphering Mutations in Actionable Genes (DeMAG), described in a study published in the journal Nature Communications. DeMAG is an open-source web server (demag.org) that offers an interpretation of the effects of all potential single amino acid mutations that could occur in 316 clinically relevant genes. These genes cause diseases for which preventive diagnostics and treatments are already available. DeMAG allows medical professionals to assess the effect of mutations in those genes more accurately by reducing the false positive rate, which means that fewer benign mutations are predicted as pathogenic. As a result, the tool can support clinical decision-making.
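To make the term concrete, the false positive rate is the share of truly benign mutations that a predictor labels as pathogenic. The short Python sketch below only illustrates that definition; the function name and the toy labels are hypothetical and are not taken from DeMAG or its data.

```python
def false_positive_rate(true_labels, predicted_labels):
    """Fraction of truly benign variants that were predicted as pathogenic."""
    # Keep only the predictions made for variants that are actually benign.
    benign_predictions = [
        pred for truth, pred in zip(true_labels, predicted_labels)
        if truth == "benign"
    ]
    if not benign_predictions:
        raise ValueError("no benign variants to evaluate")
    false_positives = sum(1 for pred in benign_predictions if pred == "pathogenic")
    return false_positives / len(benign_predictions)


# Hypothetical toy data (illustration only, not DeMAG results).
true_labels = ["benign", "benign", "benign", "benign", "pathogenic", "pathogenic"]
predicted_labels = ["pathogenic", "benign", "benign", "benign", "pathogenic", "benign"]

fpr = false_positive_rate(true_labels, predicted_labels)
print(f"False positive rate: {fpr:.2f}")
```

In this toy example, one of the four benign variants is wrongly called pathogenic. A predictor with a lower false positive rate would make fewer such calls.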
In recent years, genomic sequencing has become less expensive and more advanced. On the one hand, this allows clinicians to increasingly use sequencing for diagnostic purposes while also allowing scientists to explore more research hypotheses. On the other hand, many detected mutations do not have a clear clinical interpretation. The uncertainty over whether a mutation causes disease can be stressful for patients and can lead to psychological burden, morbidity and health-care expenses associated with under- and overdiagnosis. Existing tools are already used to predict the functional impact of these variants, but their performance is biased by limited clinical data. This makes it difficult to distinguish between pathogenic (disease-causing) and benign (neutral) variants within a given gene, and it often leads to mutations that do not cause disease being misclassified as pathogenic. Addressing these difficulties is critical for developing a reliable predictor for clinical applications.
The research group of Agnes Toth-Petroczy at the MPI-CBG and the CSBD teamed up with Christopher Cassa, assistant professor of medicine at the Brigham and Women's Hospital Division of Genetics and Harvard Medical School, and Ivan Adzhubei, research associate at the Department of Biomedical Informatics at Harvard Medical School, to develop DeMAG, a statistical model and web server that reaches high accuracy in interpreting genetic mutations in disease genes. To do this, the researchers carefully selected known pathogenic and benign mutations for training the model.
"We used clinical and various population databases. We selected only mutations whose clinical interpretation is agreed upon among multiple submitters such as medical doctors and genetics laboratories. And we also included data from ancestries that are underrepresented in the current population databases, such as Korean or Japanese, to make it even more representative and accurate," explains Federica Luppino, first author of the research paper and PhD student in the Toth-Petroczy group.
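The selection step described in the quote can be pictured as a simple filter: keep a mutation only when several submitters have classified it and all of them agree. The Python sketch below is a minimal illustration under stated assumptions. The variant identifiers, the data layout and the threshold of two submitters are all hypothetical, and the article does not specify the exact criteria the authors applied.

```python
def concordant_label(submitted_labels, min_submitters=2):
    """Return the shared label if enough submitters classified the variant
    and all of them agree; otherwise return None.

    min_submitters=2 is an assumed threshold, not a value from the study.
    """
    if len(submitted_labels) < min_submitters:
        return None
    distinct_labels = set(submitted_labels)
    if len(distinct_labels) == 1:
        return submitted_labels[0]
    return None


# Hypothetical submissions per variant (illustration only).
submissions = {
    "GENE1:p.A10V": ["pathogenic", "pathogenic"],
    "GENE1:p.R20Q": ["benign", "pathogenic"],      # conflicting interpretations
    "GENE2:p.G5S": ["benign"],                     # only one submitter
    "GENE2:p.L7P": ["benign", "benign", "benign"],
}

# Keep only variants with an agreed-upon interpretation.
training_set = {}
for variant, labels in submissions.items():
    label = concordant_label(labels)
    if label is not None:
        training_set[variant] = label

print(training_set)
```

Here the variant with conflicting interpretations and the variant with a single submitter are dropped, leaving only mutations with an agreed-upon label for training.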
DeMAG includes a novel feature, the "partners score", that identifies clusters of amino acids in a protein that share the same clinical effect. The partners score draws on two sources of information about relationships between amino acids: evolutionary information from the genomes of many organisms, and 3D protein shapes predicted by AlphaFold, the artificial intelligence (AI) algorithm developed by Google DeepMind.
Agnes Toth-Petroczy, who supervised the study, concludes, "We provide a basic framework for integrating clinical and protein data to aid assessing the impact of mutations. We hope that our tool and web server will ease variants effect assessment and clinical decision-making. Furthermore, the newly developed features can be applied to other genes and organisms beyond humans."
The DeMAG code is available on GitLab (https://git.mpi-cbg.de/tothpetroczylab/DeMAG), and all data are freely available on the web server at https://demag.org/.
Luppino, F., et al. (2023). DeMAG predicts the effects of variants in clinically actionable genes by integrating structural and evolutionary epistatic features. Nature Communications. doi.org/10.1038/s41467-023-37661-z.