A new statistical method, described in the open-access journal PLoS Biology, streamlines the computation required to identify all the potential locations in the genome that influence a particular physical trait, or phenotype.
Thanks to the new method developed by John Storey, Joshua M. Akey, and Leonid Kruglyak, researchers have a more efficient genome-mining technique to help them identify all the genomic elements that produce specific traits. In brewer's yeast alone, Storey and colleagues found that 37% of gene expression traits link to two loci, or positions in the genome.
"We were interested in being able to find combinations of genes that affect the phenotype," says Kruglyak. "It's generally thought that most traits of interest have a complex underlying genetic basis, but it's generally been pretty difficult to get at those." Typically, researchers might be able to find only one of the genetic factors, even though more than one genetic location contributes to the observed trait, such as blood pressure or cell growth.
The new statistical method bypasses the previously overwhelming computations needed to puzzle together the myriad elements that influence gene expression throughout an entire genome. And unlike earlier approaches to understanding how multiple loci interact, the new technique can distinguish between a group of genes with a linked subset and a group of genes with "joint linkage," where each gene site links to another.
"In some ways, it looks like you're complicating a problem because you're looking at thousands of genes instead of one trait," says Kruglyak. In reality, the method creates statistical conclusions that are more precise, he explains, because you're using so much data.
Storey et al. compared their method to another statistical method, called two-dimensional linkage analysis, which tests for linkage between all pairs of a large set of genomic marker sites. The authors found that two-dimensional analysis is not only more computationally demanding than their new method, but also generates ambiguous results because it can be difficult to distinguish whether one or both of the loci being tested are responsible for altered expression levels. This problem grows exponentially with each added test site. This approach also failed to reveal that hierarchical relationships between two genomic locations control about one in seven yeast expression traits, which Storey et al. discovered using their method.
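To give a rough sense of why testing all pairs of marker sites is demanding, the short Python sketch below counts how many unordered pairs an exhaustive pairwise scan would have to examine. It illustrates the pair count only and is not the authors' method or code. The function names and the marker and trait counts are assumptions chosen for the example, not figures from the study.

```python
from math import comb


def pair_count(n_markers: int) -> int:
    """Number of unordered marker pairs: n * (n - 1) / 2."""
    return comb(n_markers, 2)


def total_tests(n_markers: int, n_traits: int) -> int:
    """Pairwise tests if every marker pair is checked against every trait."""
    return pair_count(n_markers) * n_traits


# Hypothetical marker counts, for illustration only.
for n in (100, 1000, 3000):
    print(f"{n} markers -> {pair_count(n)} marker pairs")
```

Multiplying the pair count by the number of expression traits being scanned, as `total_tests` does, shows how quickly the workload adds up when thousands of traits are analyzed at once.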
Although the group studied yeast, their method can be applied to more complex organisms to search for even larger numbers of linked loci and to provide insights into the many interlocking pathways that make up the gene regulatory network.