A new study published on the preprint server medRxiv* in May 2020 has identified a link between Internet search patterns and local disease progression for COVID-19. It recommends the use of search data to track and predict COVID-19 spread as a complementary method to established public health surveillance methods.
Study: Internet Search Patterns Reveal Clinical Course of Disease Progression for COVID-19 and Predict Pandemic Spread in 32 Countries. Image Credit: Olya Gan / Shutterstock
"Aggregated de-identified Internet search patterns have been used to track a wide range of health phenomena, including influenza, MERS, measles, abortion and immunization compliance, and are a potential alternative source of information for surveilling pandemic spread," say the researchers.
The study is the first to suggest Internet searches as a predictor of COVID-19 spread on this scale. In this way, it aims to complement established methods by addressing the development delays and capacity limitations of traditional diagnostic testing, including the gold-standard RT-PCR test.
A few earlier studies attempted to correlate the two, but in smaller geographical samples and with fewer keywords. Even so, several of them found correlations between symptom-related searches and daily or short-term incidence of COVID-19 cases.
The study assessed Internet search patterns related to COVID-19 symptoms such as fever, dry cough, sore throat, and chills to determine if they were real-time predictors of local COVID-19 spread and clinical disease progression, in multiple languages across 32 countries and six continents.
Researchers found that the sequence of symptom searches mirrored the clinical progression reported in medical literature: fever, dry cough, sore throat, and chills initially, followed by shortness of breath approximately five days after symptom onset. An increase in searches related to COVID-19 symptoms also preceded rises in reported COVID-19 cases and deaths, anticipating case trends by about 18.5 days.
Limitations of laboratory testing
The research team pointed out that laboratory testing in an epidemic has several limitations: delays in test development, quality assurance, manufacturing, administration, and processing; significant risk to individuals and health workers; and financial constraints. Thus, even at its best, laboratory testing is a "lagging indicator," researchers say. Alternative approaches are therefore needed to give policymakers and health officials timely data.
"Accurate real-time surveillance of local disease spread is essential for effective pandemic response, informing key public health measures such as social distancing and closures, as well as the allocation of scarce healthcare resources such as ventilators and hospital beds," the paper said.
How they used Internet searches
The research team accessed searches on Google Trends and Weibo (a Chinese microblogging platform) for common terms such as "fever," "cough," "dry cough," "chills," "sore throat," "runny nose" and "shortness of breath," as well as the general terms "coronavirus," "coronavirus symptoms" and "coronavirus test."
Native speakers were recruited to translate the terms into some Middle Eastern and European languages. For others, Google Translate was used, and search patterns confirmed that those terms were being searched for.
Results showed a lag of approximately five days between searches for "fever" and searches for "shortness of breath." For "cough," too, the lag was five days, closely in line with the clinical progression reported in medical literature.
"As the pandemic begins to take hold in a country, people search for 'coronavirus symptoms' and 'coronavirus test', followed by initial symptoms 'fever', 'cough', 'runny nose', 'sore throat' and 'chills', followed by searches for 'shortness of breath' about 5 days after the search for initial symptoms," say the investigators.
Search trends for "fever" predicted a spurt in COVID-19 cases 18.5 days in advance, and COVID-19 deaths 22.16 days in advance. Searches for "cough" demonstrated similar trends. Correlations of search trends with confirmed infection rates were highly dependent on local testing capacity, which was responsible for some variation of data between countries.
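One common way to estimate a lead time like this is a lagged correlation: shift the search series forward by a candidate number of days and see which shift lines up best with the case series. The sketch below is illustrative only and is not the study's own code. The function names `pearson`, `best_lead` and `bump`, and the synthetic bell-shaped data, are assumptions made for the example.

```python
from math import exp, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lead(search, cases, max_lag):
    """Return (lag, r): the shift in days at which search[t] best matches cases[t + lag]."""
    best_lag, best_r = 0, pearson(search, cases)
    for lag in range(1, max_lag + 1):
        # Compare search on day t with cases on day t + lag.
        r = pearson(search[:-lag], cases[lag:])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


def bump(day, peak_day, width=6.0):
    """Synthetic bell-shaped curve, used only to build the toy data."""
    return exp(-(((day - peak_day) / width) ** 2))


# Toy data (not from the study): search interest peaks on day 20,
# reported cases peak on day 38, so cases trail searches by 18 days.
search_volume = [bump(day, 20) for day in range(60)]
reported_cases = [bump(day, 38) for day in range(60)]

lead_days, correlation = best_lead(search_volume, reported_cases, max_lag=30)
print(lead_days)  # prints 18
```

On this toy data the case curve is simply the search curve delayed by 18 days, so the function recovers an 18-day lead. Real search and case series are far noisier and, as the researchers note, depend heavily on local testing capacity.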
Despite the difference in timing of outbreaks in each country, the relationship between the search terms and the number of reported cases and deaths remains similar across countries. This means the data is useful in tracking the course of the pandemic and predicting its spread before the availability of large-scale testing in each country.
As the pandemic continues to spread in several countries, search volumes continue to increase. This data can be used to assess the real-time spread and implement health measures in the absence of local testing, researchers say.
The method can also help to understand the stage of the illness and disease manifestations in each region, which is helpful in assessing local variations in the disease presentation in different conditions.
This method is not without its limitations, researchers warn. These include a lack of Internet access and infrastructure in some regions and communities, which may not support widespread searching; socio-economic, geographic, or other biases; and the possibility that searches are driven by curiosity or news coverage, or relate to other diseases such as influenza. Also, search data is provided only as relative, aggregated search volumes and not as absolute numbers of searches.
Non-symptom-specific terms like "coronavirus" were less likely than specific symptom-related searches to accurately predict disease progression and infection trends, researchers said.
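To see why aggregated volumes limit what can be inferred, consider a simplified version of the kind of scaling Google Trends applies, in which each series is rescaled so that its peak equals 100. The sketch below is an assumption-laden simplification, not the service's actual algorithm, and the function name `to_relative_index` and the sample counts are hypothetical.

```python
def to_relative_index(counts):
    """Rescale raw counts so the peak equals 100 (simplified, hypothetical scaling)."""
    peak = max(counts)
    if peak == 0:
        return [0 for _ in counts]
    return [round(100 * c / peak) for c in counts]


# Hypothetical daily search counts for the same term in two regions.
small_region = [10, 20, 40, 20]
large_region = [1000, 2000, 4000, 2000]

print(to_relative_index(small_region))  # prints [25, 50, 100, 50]
print(to_relative_index(large_region))  # prints [25, 50, 100, 50]
```

Both regions produce the same index even though one has a hundred times more searches, which is why such data can show the shape and timing of a trend but not the absolute number of people searching.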
The research team plans to incorporate machine learning to create a predictive model to estimate the number of COVID-19 cases and deaths in a more detailed form, taking into account other variables like news reports, testing capacity, mitigation measures, and weather-related variables.
*medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.