A team that includes researchers from the National Institutes of Health (NIH) has found a new way of detecting functional regions in the human genome. The novel approach involves looking at the three-dimensional shape of the genome's DNA and not just reading the sequence of the four-letter alphabet of its DNA bases.
In a paper published in the early online edition of Science, a team led by Elliott Margulies, Ph.D., of the National Human Genome Research Institute (NHGRI), and Thomas Tullius, Ph.D., of Boston University, described an innovative approach for detecting functional genomic regions. By combining chemical and computer analyses, the researchers are able to survey the landscape, or topography, of DNA structure for areas likely to play a key role in biological function.
The method involves identifying all of the grooves, bumps and turns of the DNA that makes up the human genome and then comparing those structural features with the corresponding features in the genomes of other animal species. Structural features that have been preserved across many species are likely to play important roles in how the human body functions, while those that have changed over the course of evolution may play a less central role or no role at all.
"This new approach is an exciting advance that will speed our efforts to identify functional elements in the genome, which is one of the major challenges facing genomic researchers today," said NHGRI Scientific Director Eric Green, M.D., Ph.D. "Coupled with continued innovations in DNA sequencing, this topography-informed approach will expand our ongoing efforts to use genomic information to improve human health."
The sequence of the 3 billion DNA base pairs that make up the human genome holds the answers to many questions pertaining to human development, health and disease. Consequently, much research aimed at understanding the genome has focused on establishing the information encoded by the linear order of DNA bases. In the new study, however, researchers focused on how those bases chemically interact with each other to coil and fold the DNA molecule into a variety of shapes.
"We often think of DNA as a string of letters on a computer screen and forget that this string of letters is a three-dimensional molecule. But shape really matters," said Dr. Margulies, who is an investigator in NHGRI's Genomic Technology Branch. "Proteins that influence biological function by binding to DNA recognize more than just the sequence of bases. These binding proteins also see the surface of the DNA molecule and are looking for a shape that allows a lock-and-key fit."
In 2003, an international team of researchers finished a reference sequence of the human genome, an achievement that greatly sped efforts to find protein-coding genes, which make up approximately 2 percent of the genome. At one time, the remaining 98 percent of the genome was referred to as "junk DNA." Researchers now know that this non-coding DNA contains elements that carry out important biological functions, such as turning genes off or on. However, little information exists about where these non-coding functional elements are located and how they work.
The new approach to identifying functional elements in non-coding DNA builds upon the individual efforts of Dr. Tullius, a chemistry professor who has spent more than 20 years developing methods to examine the 3-D structure of DNA, and of Dr. Margulies, a molecular biologist who uses computer methods to compare the genomes of different species.
"We brought together two diverse fields to think about this problem in a new way," said Dr. Margulies. "It took the combined expertise of a DNA chemist and computational biologist to figure out that this chemical technique could advance our understanding of comparative genomics."
"By considering the three-dimensional structure of DNA, you can better explain the biology of the genome," said Dr. Tullius. "For this achievement, Stephen Parker, a Boston University graduate student, deserves much of the credit for his development of the algorithm that incorporated DNA structure into evolutionary analysis."
In their Science paper, the researchers compared the topography of the human genome with that of 36 other mammalian species, including mouse, rabbit, elephant and chimpanzee. Using this topographic approach, they found that about 12 percent of the non-coding DNA in the human genome appears to be functionally important, twice the amount detected by methods that simply compared DNA sequences.
What accounts for the difference? Researchers say DNA sequence is not always a good indicator of function. They found that very similar DNA sequences may assume very different topographical shapes, which can have a major impact on their function or lack of function. On the other hand, different DNA sequences may assume very similar topographical shapes and perform very similar functions. So, in many instances, DNA structure may be a better predictor of function than DNA sequence.
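The contrast between sequence similarity and shape similarity can be illustrated with a small Python sketch. This is not the researchers' algorithm. It is a toy model that assumes a hypothetical "shape" score for each step between neighboring bases, based only on whether each base is a purine (A, G) or a pyrimidine (C, T). The names sequence_identity, toy_shape_profile and shape_distance, and all of the score values, are invented for illustration.

```python
PURINES = {"A", "G"}

# Toy scores per base-step class (R = purine, Y = pyrimidine).
# These numbers are invented for illustration; they are not measured
# structural values and are not taken from the study.
TOY_STEP_SCORE = {"RR": 0.2, "RY": 0.5, "YR": 0.9, "YY": 0.2}


def sequence_identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def toy_shape_profile(seq):
    """Return one toy score per step between neighboring bases."""
    classes = ["R" if base in PURINES else "Y" for base in seq.upper()]
    return [TOY_STEP_SCORE[x + y] for x, y in zip(classes, classes[1:])]


def shape_distance(a, b):
    """Mean absolute difference between two toy shape profiles."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need equal-length sequences of at least 2 bases")
    pa, pb = toy_shape_profile(a), toy_shape_profile(b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)


if __name__ == "__main__":
    # Different letters at every position, identical toy shape.
    print(sequence_identity("AGAGAG", "GAGAGA"), shape_distance("AGAGAG", "GAGAGA"))
    # One letter changed, but the toy shape shifts at two steps.
    print(sequence_identity("AGAGAG", "AGACAG"), shape_distance("AGAGAG", "AGACAG"))
```

In this toy model, AGAGAG and GAGAGA share no letters at any position yet have identical profiles, while a single change from AGAGAG to AGACAG alters the profile at two neighboring steps. A real analysis would rely on experimentally derived structural data rather than an invented table.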
The researchers went on to mine data organized by the PhenCode Project to see whether one-base variations in DNA sequence, called single-nucleotide polymorphisms (SNPs), in non-coding regions can cause structural changes that might lead to disease. Specifically, they conducted a topographic survey of 734 non-coding SNPs known to be associated with signs and symptoms of disease. The non-coding SNPs associated with disease tended to produce larger changes in the shape of DNA than did a set of SNPs not linked to disease.
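The idea of scoring a SNP by how much it changes DNA shape can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the study: the per-step scores are a made-up toy table, the function names snp_shape_change and mean_shape_change are hypothetical, and the example sequences are invented rather than drawn from PhenCode.

```python
from statistics import mean

PURINES = {"A", "G"}

# Toy scores per base-step class (R = purine, Y = pyrimidine).
# Invented for illustration only; not measured DNA structure data.
TOY_STEP_SCORE = {"RR": 0.2, "RY": 0.5, "YR": 0.9, "YY": 0.2}


def toy_profile(seq):
    """Return one toy score per step between neighboring bases."""
    classes = ["R" if base in PURINES else "Y" for base in seq.upper()]
    return [TOY_STEP_SCORE[x + y] for x, y in zip(classes, classes[1:])]


def snp_shape_change(ref_seq, pos, alt_base):
    """Total absolute change in the toy profile caused by one substitution.

    pos is the 0-based index of the base that the SNP replaces.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref, alt = toy_profile(ref_seq), toy_profile(alt_seq)
    return sum(abs(r - a) for r, a in zip(ref, alt))


def mean_shape_change(snps):
    """Average toy shape change over (ref_seq, pos, alt_base) tuples."""
    return mean(snp_shape_change(*snp) for snp in snps)
```

With a function like this, two groups of SNPs (for example, a disease-associated set and a comparison set) could each be passed to mean_shape_change and the averages compared. In the toy table, a substitution that swaps one purine for another leaves the profile unchanged, while a purine-to-pyrimidine swap shifts it, which is only a property of the invented scores and not a claim about real DNA.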
The entire study made extensive use of data sets generated by the NHGRI-funded ENCyclopedia of DNA Elements (ENCODE) project, which is a multi-institution effort to compile a parts list of the biologically functional elements in the human genome. In addition, some of Dr. Tullius's work in developing the new technology was funded through the ENCODE project.
For an artist's depiction of DNA packaging and topography, go to http://www.genome.gov/pressDisplay.cfm?photoID=20150.