During the past 20 years, researchers have identified thousands of protein interactions in cells, with the ultimate goal of inventorying all that occur within the cells of various organisms, a comprehensive catalogue known as the interactome.
Such information will be critical to understanding the basic mechanics of cellular life, and how malfunctions in these processes contribute to cancer.
Unfortunately, the data collected by different teams of researchers has been somewhat inconsistent. One group's "map" of protein interactions in yeast cells, for example, may only partially overlap the map produced by another group. Because science depends on investigators' ability to reproduce and build on one another's work, such variability presents a considerable obstacle. The value of interactome maps -- and the potential of further research -- will be at issue as long as the accuracy and thoroughness of the underlying data are uncertain.
To recapture momentum, the field needs to be clear about the strengths and weaknesses of different methods of tracking protein interactions, researchers say, and reach a consensus on questions such as: How reliable is the data produced by different techniques? What portion of the interactome of different organisms has been mapped so far? Why do existing experimental techniques fail to detect certain interactions? What can be done to improve the quality of data collected?
In a series of four papers published in the January issue of the journal Nature Methods, investigators in Dana-Farber Cancer Institute's Center for Cancer Systems Biology (CCSB) start to answer those questions by examining the accuracy and thoroughness of current interactome maps and the techniques by which they are compiled. The studies -- in a special issue of the journal on the interactome -- provide a set of ground rules for future research and demonstrate the power of such research when backed by well-proven experimental techniques. The CCSB's director, Marc Vidal, PhD, is the senior author of the papers.
Framework for study
The first study, lead-authored by the CCSB's Kavitha Venkatesan, PhD, offers a framework for gauging the quality of current maps of the interactome in human cells. The maps draw on three sources of information about protein interactions: high-throughput yeast two-hybrid (HT-Y2H) procedures, which use robotic equipment to screen thousands of proteins to see which bind to each other (the binding switches on a "reporter" gene that can be chemically detected); compilations of published studies on small numbers of protein interactions; and studies that predict interactions based on computational techniques. While each approach is useful, it isn't clear whether small-scale experiments provide better data than high-volume screenings (as some studies have suggested), whether the interactions detected in experiments actually occur in living cells, and whether existing maps depict a small or large portion of the entire interactome.
All experimental techniques generate some false positives -- in which interactions are "detected" that haven't really taken place -- and false negatives -- in which interactions that have occurred fail to be found. To weed them out, the new framework examines experimental methods from the standpoint of precision, sensitivity, and completeness. "The framework approach takes as standards interactions reported in multiple studies of high quality, and then verifies those standards against results obtained by other techniques," says Venkatesan.
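As a rough illustration of two of the measures named above, the sketch below scores a screen against a reference set of trusted interactions. It is a toy example, not the framework's actual procedure: the protein names, the `pair` helper, and the assumption that the reference set is the complete truth for the proteins tested are all hypothetical. Completeness, the share of all possible protein pairs a screen actually tested, is left out.

```python
def pair(a, b):
    """Represent an interaction as an unordered pair of protein names."""
    return frozenset((a, b))


def precision(detected, reference):
    """Fraction of detected pairs that are in the reference set.

    The remainder are treated as false positives (toy assumption: the
    reference set is the full truth for the proteins tested).
    """
    detected = set(detected)
    if not detected:
        return 0.0
    return len(detected & set(reference)) / len(detected)


def sensitivity(detected, reference):
    """Fraction of reference pairs the screen found.

    The remainder are false negatives.
    """
    reference = set(reference)
    if not reference:
        return 0.0
    return len(set(detected) & reference) / len(reference)


# Hypothetical reference set of well-supported interactions.
reference = {pair("A", "B"), pair("C", "D"), pair("E", "F"), pair("G", "H")}

# Hypothetical screen output: two real hits and one spurious hit.
detected = {pair("B", "A"), pair("C", "D"), pair("X", "Y")}

print(f"precision:   {precision(detected, reference):.2f}")
print(f"sensitivity: {sensitivity(detected, reference):.2f}")
```

In this toy case two of the three reported pairs are real, and two of the four reference pairs were found.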
Using the framework, the Dana-Farber team found that each technique captures only 20 to 30 percent of all the interactions within cells. That led them to estimate that the human interactome contains about 130,000 interactions, only a small minority of which have been mapped so far.
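The step from a capture rate to an overall size estimate can be sketched with simple arithmetic. The function below is one plausible way to make that extrapolation, not necessarily the calculation used in the paper, and every input number is made up for illustration.

```python
def estimate_interactome_size(n_detected, precision, sensitivity, completeness):
    """Extrapolate the total number of interactions from one screen.

    n_detected   -- interactions the screen reported
    precision    -- fraction of reported interactions that are real
    sensitivity  -- fraction of real interactions the method can capture
    completeness -- fraction of all possible protein pairs that were tested
    """
    if not 0 <= precision <= 1:
        raise ValueError("precision must be between 0 and 1")
    if not (0 < sensitivity <= 1 and 0 < completeness <= 1):
        raise ValueError("sensitivity and completeness must be in (0, 1]")
    real_hits = n_detected * precision            # discard false positives
    return real_hits / (sensitivity * completeness)  # correct for what was missed


# Hypothetical screen: 1,000 reported pairs, 80 percent of them real,
# a method that captures 25 percent of real interactions, and a search
# that covered 10 percent of all possible pairs.
estimate = estimate_interactome_size(1000, 0.8, 0.25, 0.1)
print(f"estimated total interactions: {estimate:,.0f}")
```

The lower a method's sensitivity and the smaller the share of pairs tested, the larger the gap between what a screen reports and what the cell actually contains.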
The second study offers researchers a tool kit for determining whether a newly discovered interaction is indeed real, and not a false positive reading from a particular type of experiment. The kit is a set of four high-capacity protein interaction tests that have been weighted in relation to a common set of benchmark data. When scientists identify two proteins as likely interactors, the pair can be tested in the tool kit to obtain a "confidence score" about whether they do, in fact, interact.
"This general approach will allow researchers to systematically and objectively assign confidence scores to all individual protein-protein interactions in cells," says lead author Pascal Braun, PhD. "Such a universally interpretable quality standard is critical for constructing accurate interactome maps."
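One way to picture a benchmark-weighted confidence score is to treat each test as a piece of evidence and combine the results. The article does not give the scoring formula, so the sketch below substitutes a simple naive Bayes combination that assumes the tests are independent. The assay names, benchmark rates, and prior are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assay:
    name: str
    positive_rate: float  # share of benchmark true pairs the assay scores positive
    random_rate: float    # share of benchmark random pairs the assay scores positive


def confidence_score(assays, results, prior=0.5):
    """Return the probability that a pair truly interacts.

    results maps assay name -> True/False for the candidate pair.
    Each assay multiplies the odds by a likelihood ratio taken from its
    benchmark rates (naive Bayes, i.e. assays assumed independent).
    """
    odds = prior / (1 - prior)
    for assay in assays:
        if results[assay.name]:
            odds *= assay.positive_rate / assay.random_rate
        else:
            odds *= (1 - assay.positive_rate) / (1 - assay.random_rate)
    return odds / (1 + odds)


# Four hypothetical assays with made-up benchmark rates.
assays = [
    Assay("assay_A", positive_rate=0.30, random_rate=0.03),
    Assay("assay_B", positive_rate=0.25, random_rate=0.02),
    Assay("assay_C", positive_rate=0.20, random_rate=0.01),
    Assay("assay_D", positive_rate=0.35, random_rate=0.05),
]

# A candidate pair that scored positive in two of the four assays.
results = {"assay_A": True, "assay_B": False, "assay_C": True, "assay_D": False}
print(f"confidence score: {confidence_score(assays, results):.3f}")
```

Because every pair is scored against the same benchmark, scores from different experiments can be compared directly, which is the point Braun makes about a universally interpretable standard.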