Scientists will be able to pinpoint genetic causes of human diseases faster than ever thanks to a powerful new discovery method unveiled by the Southwest Foundation for Biomedical Research (SFBR) and an international team of researchers.
In the Sept. 16 online edition of Nature Genetics, the team describes its method for isolating genes that are self-regulated, meaning they harbor variations that affect their own output, then rapidly narrowing in on genes that likely have a causal effect on a particular disease or disease trait. That approach makes it possible in many studies for researchers to quickly sift through the 25,000 genes in the human genome and see which ones should be the focus of follow-up investigations.
As proof of concept, the group recounts how it used this method to identify a gene – VNN1 – that regulates HDL, the “good” cholesterol, a finding with major implications for heart disease.
“We basically just zeroed in on the low hanging fruit,” said John Blangero of SFBR, who directed the study. “Instead of looking at all of the genes, we focused on the ones that strongly control their own outputs, and of those genes we then looked at the ones that correlate with disease risks. This approach narrows down the field of genes to target very quickly. While this has been done before on a very limited scale, the sheer power of our AT&T Genomics Computing Center, plus multiple generations of genetic data we have accumulated in the San Antonio Family Heart Study, allowed us to apply this method to a much larger number of study samples. No one has ever applied this method on an epidemiological scale before.”
Home to the world's largest parallel computing cluster dedicated to human genetic research, SFBR's AT&T Genomics Computing Center allows the Foundation's scientists to analyze vast amounts of complex genetic data at record speed.
The researchers are already following up with analyses of 60 other genes that appear related to HDL cholesterol, and they are applying the method toward gene discovery for other factors related to heart disease, as well as diabetes, obesity, and cystinosis, a rare genetic disorder. So far they have found approximately 100 genes that appear related to diabetes.
“Although in this paper we show how we used the method to find a gene with a big influence on HDL cholesterol, we've begun applying this same approach to every disease that we work on and have obtained outstanding results,” said Harald Göring, an SFBR geneticist who is the lead author on the paper. “It's the biggest speed-up in discovery that we've ever experienced.”
Genes that exhibit major control of their own outputs are known as “cis-regulated” genes. The output of these self-regulated genes is primarily affected by DNA variations within the genes themselves. This means that, if a cis-regulated gene is found to be correlated with a disease trait, there is a greater likelihood of quickly identifying genetic variations that play a causative role.
“This paper represents a proof of principle for a rapid approach to discover genes directly involved in disease,” Blangero said. “The ability to pinpoint the cis-regulated genes not only speeds up the discovery process, but means that you immediately have a good target for drugs to treat those diseases that they influence.”
Blood samples from 1,240 participants in SFBR's ongoing San Antonio Family Heart Study provided the genetic material for the study detailed in Nature Genetics. That study includes approximately 1,400 members of 40 Mexican-American families in the San Antonio area, who are participating in a long-term investigation of the genetic determinants of heart disease, diabetes and obesity.
The researchers in this new investigation focused their analyses on lymphocytes (a subset of white blood cells) that had been obtained from the participants of the family study. Using newly available glass “chips” containing sensors for virtually all of the approximately 25,000 genes in the human genome, they measured the amount of messenger RNA, or mRNA, the output of genes that subsequently gets converted into the proteins that perform the genes' functions in the body.
In the next step, the researchers examined these gene expression patterns to identify the self-regulated genes, which come in slightly different forms that generate more or less messenger RNA.
“The expectation is that the more mRNA present, the more protein that will be made,” Göring said.
The self-regulated genes were identified using extremely computer-intensive statistical analyses designed to locate regulatory DNA variants in the human genome. The investigators found several thousand genes that are likely to harbor DNA variants within themselves that determine how much mRNA, and ultimately protein, a gene produces. This told them which genes were cis-regulated.
To demonstrate how genetic expression patterns can be used to speed up the search for disease-influencing genes, the researchers chose HDL cholesterol as an example. To identify those genes that influence a person's “good cholesterol” level, they statistically correlated the gene expression profiles with the variable HDL cholesterol levels in the San Antonio Family Heart Study participants.
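The two-step filter described above (keep the cis-regulated genes, then correlate their expression with a trait) can be illustrated with a minimal Python sketch. It is not the researchers' actual analysis, which relied on computer-intensive statistics applied to family data. Here a plain Pearson correlation stands in for that step, and every gene name, score, threshold and data value is a hypothetical assumption made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def candidate_genes(expression, cis_score, trait,
                    cis_threshold=0.5, corr_threshold=0.8):
    """Return (gene, r) pairs, strongest correlation first.

    Step 1: keep genes whose (hypothetical) cis-regulation score
            meets cis_threshold.
    Step 2: of those, keep genes whose expression correlates with the
            trait at |r| >= corr_threshold.
    Both thresholds are arbitrary illustrative values.
    """
    cis_genes = [g for g, score in cis_score.items() if score >= cis_threshold]
    hits = []
    for gene in cis_genes:
        r = pearson(expression[gene], trait)
        if abs(r) >= corr_threshold:
            hits.append((gene, r))
    return sorted(hits, key=lambda pair: -abs(pair[1]))


# Toy data: five participants, three made-up genes.
hdl = [40.0, 50.0, 60.0, 70.0, 80.0]           # HDL level per participant
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],        # tracks HDL closely
    "GENE_B": [3.0, 1.0, 4.0, 1.0, 5.0],        # weak relationship with HDL
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],        # tracks HDL, but not cis-regulated
}
cis_score = {"GENE_A": 0.9, "GENE_B": 0.7, "GENE_C": 0.1}

hits = candidate_genes(expression, cis_score, hdl)
for gene, r in hits:
    print(f"{gene}: r = {r:.2f}")
```

In this toy data only GENE_A survives both filters. GENE_C correlates strongly with HDL but is dropped in the first step because its assumed cis-regulation score is low, which is the sense in which the approach narrows the field before any trait correlation is computed.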