Writing in the May 22, 2023 issue of Cell Systems, a diverse team of scientists, led by researchers at University of California San Diego School of Medicine, has produced a novel map that depicts the human body's enormously complicated and highly evolved system for addressing and repairing DNA damage, a cause and consequence of many diseases.
Damage to DNA and replication errors caused by stress and other factors play a major role in disease, and are a hallmark of cancer and other afflictions. To maintain the integrity of the genome and support normal functioning and health, cells have evolved an intricate network of cell-cycle checkpoints and DNA damage repair tools, collectively known as DNA damage response or DDR.
Defects in DDR are linked to numerous diseases, including cancer and heritable neurological disorders caused by unstable DNA, erroneous repeats, rearrangements and mutations. Conversely, better understanding how DDR works and why it sometimes fails provides new therapeutic opportunities to treat or cure the same diseases.
"The ongoing challenge, of course, is that DDR is an extremely complex system involving hundreds of different proteins assembling in different ways to address different problems. You can't fix a problem with DDR until you understand how it works."
Trey Ideker, PhD, senior author, professor at UC San Diego School of Medicine and UC San Diego Moores Cancer Center
In the new paper, Ideker and colleagues take a major step forward in elucidating the complexities and functions of DDR, producing a multi-scale map of protein assemblies in DDR.
Unlike earlier maps, which were based on published scientific literature that included conflicting findings or tended to focus only on well-studied mechanisms, the new reference map employs affinity purification mass spectrometry and a broad collection of multi-omics data to develop a fuller picture: a hierarchical organization of 605 proteins in 109 assemblies that captures canonical repair mechanisms and proposes new DDR-associated proteins linked to stress, transport and chromatin functions within cells.
Multi-omics is a new approach in which data sets of different omics groups are combined during analysis to create a more complete and nuanced understanding of whole systems and organisms.
The cell contains different classes of molecular processes: genomics, transcriptomics, proteomics and others. Each of these "omics" molecular processes involves interactions between thousands of genes, transcripts or proteins. To make sense of this complexity, scientists have tended to take a reductionist view, examining omics one at a time.
In contrast, systems biology considers molecular processes simultaneously and holistically, using machine learning and other tools to evaluate to what extent different molecular processes inform any given interaction, and how whole systems and networks work. Machine learning describes computer systems that are able to learn and adapt without following explicit instructions. It is an application of artificial intelligence.
"Experimental screens of ever-increasing scale are capturing interactions between genes or proteins in human cells, often beyond what has been described in literature. They can, in principle, be used to create data-driven maps of DDR," said first author Anton Kratz, PhD, formerly a research scientist in Ideker's lab who now works at The System Biology Institute in Tokyo, Japan.
But screening presents its own challenges: different forms of screening may measure molecular processes in isolation, missing some interactions that appear only under certain stresses or conditions. To address these challenges, the researchers measured new protein-protein interaction networks centered around 21 key DDR factors, with and without DNA damage. They developed a machine learning approach to combine the new data with existing data, and used statistical analysis to show that the new data significantly informed the resulting map.
"To me, two things were most revelatory," said Kratz. "First, the sheer amount of novel proteins in the map. About 50% of the proteins included in the map following our data-driven paradigm were not included in the literature-curated maps considered here, justifying a data-driven approach to building the map.
"Second and related to that, membership to DDR is not a binary affair, but takes place on a continuum (and we quantify this continuum), extending to stress, transport, and chromatin functions."
The researchers have created interactive software that will enable other scientists to investigate proteins and DDR interactions of specific interest. Kratz said scientists can also use the map as a component in visible machine learning systems that potentially could illuminate larger questions, such as how DDR is relevant in the transition from genotype (the genetic constitution of an individual organism) to phenotype (characteristics of an individual resulting from interaction of its genotype with the environment). One example is how drug or toxin exposure might change the DDR.
Co-authors include: Minkyu Kim, Maya Modak and Nevan J. Krogan, UC San Francisco; Mark Kelly, Fan Zheng, Keiichiro Ono, Yue Qin, Christopher Churas, Jing Chen, Rudolf T. Pillich, Jisoo Park, Rachel Collier, Kate Licon and Dexter Pratt, all at UC San Diego; Christopher A. Koczor and Jianfeng Li, University of South Alabama; and Robert W. Sobol, Brown University.
Kratz, A., et al. (2023) A multi-scale map of protein assemblies in the DNA damage response. Cell Systems. doi.org/10.1016/j.cels.2023.04.007.