What's the best way to get a statistically reliable sample of people who are hard to identify, such as illegal-drug users in large cities, itinerant jazz musicians, aging Manhattan artists and semi-professional storytellers?
Answer: Use a new "pyramid" sampling method developed by a Cornell University sociologist. The Centers for Disease Control and Prevention (CDC) will use the method to recruit injection drug users (IDUs) and measure their HIV-risk behavior in the 25 U.S. cities with the largest number of AIDS cases.
The sampling method, called respondent-driven sampling (RDS), combines "snowball sampling" (identifying a set of initial respondents who recruit their peers into the study, with each new set of respondents then recruiting their own peers) with a mathematical model that weights the sample to compensate for the fact that it was obtained in a non-random way.
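As a rough illustration of those two ingredients, the Python sketch below first recruits a sample by chain referral and then reweights it. Everything in it is an assumption made for illustration: the function names, the toy peer network and the weighting rule itself. The rule shown, weighting each respondent by the inverse of his or her reported number of peers, is one commonly used RDS-style estimator and is not necessarily the exact model described here.

```python
def snowball(peers, seeds, max_waves):
    """Chain-referral recruitment: seeds recruit peers, who recruit their own peers."""
    sampled = list(seeds)
    seen = set(seeds)
    wave = list(seeds)
    for _ in range(max_waves):
        next_wave = []
        for person in wave:
            for peer in peers.get(person, []):
                if peer not in seen:  # each person joins the study only once
                    seen.add(peer)
                    next_wave.append(peer)
        sampled.extend(next_wave)
        wave = next_wave
    return sampled


def weighted_proportion(sample, degree, has_trait):
    """Estimate how common a trait is, weighting each respondent by 1 / degree.

    Well-connected people are more likely to be recruited, so they are
    down-weighted (an illustrative assumption, not the source's stated model).
    """
    weights = {p: 1.0 / degree[p] for p in sample}
    total = sum(weights.values())
    return sum(w for p, w in weights.items() if has_trait[p]) / total


# Hypothetical toy network: who names whom as a peer.
peers = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
degree = {p: len(names) for p, names in peers.items()}  # reported network size
has_trait = {"a": True, "b": True, "c": False, "d": False}

sample = snowball(peers, seeds=["a"], max_waves=2)
naive = sum(has_trait[p] for p in sample) / len(sample)
adjusted = weighted_proportion(sample, degree, has_trait)
print(sample, naive, adjusted)
```

In this toy case the unweighted share of respondents with the trait is one half, while the weighted estimate is one third, because the two respondents with the trait are also the two with the most peers and are therefore over-represented by referral.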
"The statistical method enables researchers to provide both unbiased population estimates and measures of the precision of those estimates," explains Douglas Heckathorn, professor of sociology at Cornell. He developed RDS in 1997 for a National Institute on Drug Abuse HIV-prevention research project targeting drug users in several Connecticut cities. "When applied in a way that fits the mathematical model on which RDS is based, its results have proven to be unbiased for samples of meaningful size," he says.
RDS is already used by the CDC's Global AIDS Program to survey IDUs in Bangkok and IDUs and commercial sex workers in Vietnam. It is also being used by Family Health International, the largest non-profit agency in international public health, to study gay men, IDUs and prostitutes in more than a dozen countries and provinces, including Bangladesh, Myanmar (Burma), Cambodia, Egypt, Honduras, India, Kosovo, Mexico, Nepal, Vietnam, Pakistan, Papua New Guinea and Russia. The National Institute on Drug Abuse also uses RDS to survey IDUs in several cities in Russia.
RDS is now getting its first broad national use to survey IDUs as part of CDC's National HIV Behavioral Surveillance System. (The CDC is part of the U.S. Department of Health and Human Services.) Heckathorn notes that drug injection was related to about 15 percent of new HIV cases reported in the United States in 2002 and accounted for nearly 20 percent of new cases among women. "This new system will provide the first nationally comprehensive estimates of HIV risk behaviors among IDUs, thereby providing the detailed information needed to determine where new interventions will be most effective and where current interventions are working best," Heckathorn says.
Heckathorn notes that RDS improves upon standard probability sampling methods because it reaches members of the target population who would otherwise be overlooked. "For example, while drug injectors could be sampled from needle exchanges and from the streets on which drugs are sold, this approach misses many women, youths and those who only recently started injecting," says Heckathorn, who has published a study using RDS to sample jazz musicians in four cities and is now preparing to apply RDS to aging artists and to storytellers. He is completing a book on RDS while serving as a visiting scholar at the Russell Sage Foundation.
A similar problem faced pollsters during the recent presidential election. "Phone-based polls were not able to access voters who had abandoned land-based phones in favor of cell and Internet phones or voters who merely refused to be interviewed," Heckathorn notes. "What little was known about these inaccessible voters showed that they were not the same as other voters; for example, they tended to be younger."
Pollsters, he says, had no way of knowing how to adjust their estimates to compensate for voters they had missed. "Similarly, public health researchers had no reliable way before RDS to determine how those they could access through location-based sampling differed from those who were inaccessible."