Four years after publicly revealing the official draft human genetic sequence, researchers have reached the halfway point in dotting the i's and crossing the t's of the genetic sentences describing how to build a human.
The newly finalized chromosome 5 is the 12th chromosome polished off, with 12 more to go. As the new sequence reveals, this chromosome is a genetic behemoth containing key disease genes and a wealth of information about how humans evolved.
Chromosome 5 is the second of three chromosomes that the Department of Energy Joint Genome Institute (JGI) has finalized in collaboration with colleagues at the Stanford Human Genome Center (SHGC). The final sequence analysis will be published in the Sept. 16 issue of Nature.
"This extremely accurate sequence will be a powerful tool for scientists trying to understand human disease," said Secretary of Energy Spencer Abraham. "I'm pleased that the Department of Energy, which launched the human genome project in the mid-1980s, could help make this important contribution."
Scientists and staff from the Lawrence Berkeley, Lawrence Livermore and Los Alamos national laboratories make up the JGI, one of the world's largest and most productive public genome sequencing centers. JGI, in partnership with SHGC, completed the sequencing of three of the human genome's chromosomes (numbers 5, 16 and 19), which together contain some 3,000 genes, including those implicated in forms of kidney disease, prostate and colorectal cancer, leukemia, hypertension, diabetes and atherosclerosis. The chromosome 19 sequence was published in the April 1, 2004, issue of Nature.
"I am confident that the interesting features that we have identified from this sequence information are data that the research community can trust and put to good use," said Richard M. Myers, Professor and Chair of Genetics, who is also the director of the Stanford Human Genome Center.
Chromosome 5, the largest to be completed thus far, is made up of 180.9 million genetic letters – the As, Ts, Gs, and Cs that compose the genetic alphabet. Those letters spell out the chromosome's 923 genes, including 66 genes that are known to be involved in human disease. Another 14 diseases seem to be caused by chromosome 5 genes but haven't yet been linked to a specific gene. Other chromosome 5 genes include a cluster that codes for interleukins, molecules that are involved in immune signaling and maturation and are also implicated in asthma.
The spaces between the genes are as important as the genes themselves, said Eddy Rubin, JGI's director. "In addition to disease genes, other important genetic motifs gleaned from vast stretches of noncoding sequence have been found on Chromosome 5. Comparative studies conducted by our scientists of the vast gene deserts where it was thought there was little of value have shown that these regions, conserved across many mammals, actually have powerful regulatory influence."
These gene-free stretches were previously considered "junk DNA," but in recent years those seemingly barren regions have taken on greater prominence as researchers have learned that they can control the activity of distant genes. Some of the noncoding regions have also stayed remarkably similar to their counterparts in mice or fish, rather than accumulating mutations over the course of evolution.
"If you have such large human regions that stay conserved over vast evolutionary distances, it strongly supports the idea that they must contain something important," said Jeremy Schmutz, the informatics group leader at SHGC. Any mutation that appeared in those conserved regions was likely to have either killed the animal or made it less able to reproduce, preventing the mutation from making it to the next generation. So far, nobody has shown what role the conserved regions play. "What this says is that we don't know as much about this conserved stuff as we think we do," Schmutz said.
Hidden in the chromosome 5 sequence are clues to how humans evolved after branching away from chimpanzees. On average, the chromosome is more than 99 percent similar between chimpanzees and humans, with the greatest similarity found in genes that cause diseases when mutated.
Despite similarities in the overall sequence, the human and chimpanzee versions of the chromosome have some structural differences, including one large section that is flipped backwards in humans compared with chimps. Such an inversion makes it impossible for the two chromosomes to pair up when the cell divides to create sperm and eggs. Over time, that incompatibility could have driven a reproductive wedge between the evolving populations.
Moving evolutionarily further away, about one-third of chromosome 5 is similar to a chicken chromosome that determines the chicken's sex, much like the X and Y chromosomes in humans. This finding backs up previous research suggesting that before mammals and birds split 300 million years ago, the sex chromosomes had not yet evolved. After the split, mammals and birds developed their own methods of creating males and females.
One duplicated region on chromosome 5 could eventually help explain how spinal muscular atrophy is inherited. Researchers had known that deletions in the survival of motor neurons (SMN) gene caused the disease, but people with the same deletion can have much more or less severe forms of it. It turns out that the region contains many duplications and other rearrangements and varies considerably between people. Schmutz said that, with the sequence for this region in hand, researchers can now study how variations in the number of deletions or repetitions influence disease severity.
For the chromosome 5 effort at JGI, Susan Lucas led the sequencing and Joel Martin the mapping and analysis efforts. Additional Stanford contributors included Jane Grimwood, the finishing group leader, and Mark Dickson, the production sequence group leader.