In a new study, researchers present a "cautionary tale" about what may go wrong when using the fledgling science of proteomics to devise a diagnostic test for cancer.
In the February 16 issue of the Journal of the National Cancer Institute, researchers from The University of Texas M. D. Anderson Cancer Center detail why an experimental test intended to identify early ovarian cancer from a small sample of blood is unlikely to lead to a reliable clinical test right away.
After conducting repeated checks of the data that supported the test's effectiveness, the researchers say their findings indicate that claims about the experimental protein-based assay are not biologically plausible.
"We view this as a cautionary tale. If you are not careful with this new technology, whose quirks we don't fully understand, you can find results that may be due to something other than biology," says the study's lead author, Keith Baggerly, Ph.D., an associate professor in the Department of Biostatistics & Applied Mathematics.
He adds that this study "illustrates the need for researchers to set standards by which to conduct proteomics research," meaning that protocols involved in these investigations should be common across laboratories so that results from one lab can be verified by others.
"We are moving in that direction," he adds. "The technology being used to develop a variety of proteomic diagnostic tests is getting better and we are getting more reproducible results."
Researchers worldwide are excited about the notion of using protein "barcodes" to identify individual cancers before symptoms appear, but Baggerly and others maintain that the promise of this emerging field of proteomics has not yet been met because complex, reproducible patterns of proteins have proved difficult to find.
According to Baggerly, that now appears to be the case with the experimental ovarian cancer test at issue, which was first proposed in 2002 by researchers writing in the journal The Lancet. That study reported dramatic results using mass spectrometry to search for a pattern of proteins in blood. In several sets of blinded samples, the test detected all patients who had ovarian cancer and misdiagnosed only three healthy individuals.
Baggerly and his colleagues (Kevin Coombes, Ph.D., and Jeffrey Morris, Ph.D., from M. D. Anderson Cancer Center, and Sarah Edmonson, M.D., from Baylor College of Medicine) were among those who subsequently tested the study's data for reproducibility. They say the issue centers on how the mass spectrometer data were analyzed.
A mass spectrometer is an instrument that can quantitatively measure the concentration of hundreds of proteins from a single sample. In short, it does this by using an electric field to propel ionized proteins toward a detector. The number of ions hitting the detector at each mass-to-charge ratio (also known as m/z ratio) is recorded to produce a protein spectrum. A peak in the graph of the spectrum represents a protein (the identity of which is often unknown).
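The idea of a spectrum as a count of detector hits at each m/z value, with peaks standing in for proteins, can be illustrated with a minimal sketch. It is not the analysis used in any of the studies discussed here: the function names, the bin width, the peak threshold, and the sample values are all assumptions made up for illustration.

```python
from collections import Counter

def build_spectrum(mz_hits, bin_width=1.0):
    """Count detector hits per m/z bin; returns a Counter of bin index -> count."""
    return Counter(int(mz // bin_width) for mz in mz_hits)

def find_peaks(spectrum, min_count=3):
    """Return bin indices that are local maxima with at least min_count hits.

    Hypothetical rule for illustration: a bin is a peak if it is strictly
    higher than its left neighbor and at least as high as its right neighbor.
    """
    peaks = []
    for idx, count in sorted(spectrum.items()):
        left = spectrum.get(idx - 1, 0)
        right = spectrum.get(idx + 1, 0)
        if count >= min_count and count > left and count >= right:
            peaks.append(idx)
    return peaks

# Made-up m/z readings: two clusters of hits plus one stray ion.
hits = [100.2, 100.5, 100.7, 100.9, 101.1,
        250.3, 250.4, 250.6, 250.8, 250.9, 251.2,
        400.0]
spectrum = build_spectrum(hits)
peaks = find_peaks(spectrum)
print(peaks)  # bins flagged as peaks: [100, 250]
```

In this toy example the single hit at m/z 400 falls below the threshold and is treated as noise rather than as a protein peak. Real spectra require baseline correction, calibration, and far more careful peak detection.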
The goal is to find a pattern of peaks that will distinguish between patients with cancer and those who are cancer-free. The authors of the Lancet study have reported different proteomic patterns in three separate data sets. Researchers at the State University of New York, Stony Brook, reanalyzed the data to look for reproducible patterns. In 2003, they reported finding a single pattern involving 18 peaks (or unidentified proteins) that could diagnose ovarian cancer across two of these data sets.
The M. D. Anderson researchers examined the quality of the data sets and concluded that this systematic protein pattern "is biologically implausible." Baggerly explains that discriminatory peaks appear to be spread across the entire m/z spectrum in the second data set, but "changes in protein expression associated with cancer should affect only a few specific peaks, not the entire spectrum."
Furthermore, the researchers say some of the protein peaks were found in regions of the spectra where, for technical reasons, mass spectrometry cannot be sampling proteins, and thus may simply represent experimental "noise." Results like these can arise from procedural problems such as incomplete randomization of samples, Baggerly says.
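To see why randomization matters, consider a sketch in which samples are shuffled before being run, so that cancer and control samples are not processed in separate blocks where instrument drift could masquerade as a biological difference. This is an illustration under assumed names and made-up labels, not a description of how any of the laboratories involved handled their samples.

```python
import random

def randomized_run_order(labels, seed=0):
    """Return the sample labels in a shuffled run order (seeded for repeatability)."""
    rng = random.Random(seed)
    order = list(labels)
    rng.shuffle(order)
    return order

def longest_same_label_run(order):
    """Length of the longest streak of consecutive samples sharing a label."""
    longest = current = 0
    previous = None
    for label in order:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest

# Hypothetical batch: five cancer samples followed by five controls.
unrandomized = ["cancer"] * 5 + ["control"] * 5
shuffled = randomized_run_order(unrandomized)

print(longest_same_label_run(unrandomized))  # 5: each group runs as one block
print(longest_same_label_run(shuffled))      # depends on the shuffle
```

When each group runs as a single uninterrupted block, any change in the instrument between the two blocks is confounded with disease status. A streak check like this one is only a crude diagnostic, but it shows the kind of procedural detail that shared standards would need to cover.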
The study raises a question that is broader than the effectiveness of the experimental ovarian cancer test, the researchers say. "Are we going to be able to measure what we want to measure?" Baggerly asks. "Given issues in technology and a lack of standards, I think it will be a few years before we can know what works."