Sifting through the junk with ENCODE

By Helen Albert

The first large-scale results from the ENCyclopedia Of DNA Elements (ENCODE) Consortium have shed light on possible functions of large areas of the genome previously thought of as "junk DNA."

Since 2007, when the pilot phase of the project was completed covering 1% of the genome, 442 researchers from 32 laboratories around the world have been working to map and understand the noncoding or junk sections of DNA that make up a large part of the human genome.

The work is ongoing, but this week 30 papers - six published in Nature, 18 in Genome Research, and six in Genome Biology - present an integrative analysis of over four million regulatory regions in the genome that have been mapped as part of ENCODE. The analysis covers transcription, transcription factor association, chromatin structure, and histone modification.

"We now have a parts list of what makes us human," Mark Gerstein (Yale University, New Haven, Connecticut, USA), author of one of the Nature papers, told the press. "What we are doing is figuring out the wiring diagram of how it all works."

A major ENCODE discovery, published in one of the Nature papers, is that despite only 1% of our genome encoding proteins, over 80% is involved in at least one biochemical RNA- or chromatin-associated event in at least one type of cell.

In addition, the scientists discovered that many regulatory elements are physically associated with one another and with expressed genes. More specifically, 95% of the genome lies within 8 kb of a DNA-protein interaction, and 99% within 1.7 kb of a biochemical RNA- or chromatin-associated event.

The Consortium believes that findings from ENCODE should also help interpret data from genome-wide association studies carried out to pinpoint single nucleotide polymorphisms (SNPs) associated with diseases such as diabetes.

A problem with these studies has been that many SNPs associated with disease are not found within protein-coding regions of the genome. ENCODE demonstrates that many of these SNPs occur in areas containing a high number of noncoding functional elements, which suggests they may interfere with gene regulation rather than directly affecting protein coding.

"Until now, everyone has just been looking under the proverbial lamppost for the causes of disease," said senior ENCODE Consortium member Michael Snyder (Stanford University School of Medicine) in a press statement. "But 85% of variants identified through genome-wide association studies, or GWAS, lie outside these regions. Now we can greatly expand our studies to the rest of the genome."

To make the research easier to access, all the papers from the three journals are open access and are available through the Nature ENCODE website along with additional information about the project.

Licensed from medwireNews with permission from Springer Healthcare Ltd. ©Springer Healthcare Ltd. All rights reserved. Neither of these parties endorse or recommend any commercial products, services, or equipment.


The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of News Medical.