Efficient and effective control of the pandemic requires complete epidemiological data, including accurate and timely estimates of disease transmission, disease burden, and disease severity. Poor-quality or insufficient data on intervention benefits, setting- and activity-specific risks, and transmission mechanisms can reduce the equity, efficiency, and effectiveness of the public health response.
Before the outbreak of the COVID-19 pandemic, the US Centers for Disease Control and Prevention’s (CDC) pandemic strategy had specified the data required to manage epidemics caused by novel respiratory viruses.
Essential parameters of epidemic transmission include the generation interval, the incubation period, the infectious period, the clinical fraction, and the secondary attack rate. These parameters help to predict the intensity and extent of epidemic transmission. They also help to determine strategies for contact tracing and isolation, as well as for targeting particular settings and groups.
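As a rough illustration of how two of these parameters are computed, the sketch below estimates a secondary attack rate from contact-tracing counts and approximates the reproduction number from the epidemic growth rate and the mean generation interval. The function names and all input numbers are hypothetical and do not come from the study. The relation R = 1 + rT holds only under the assumption of an exponentially distributed generation interval; other distributions give other formulas.

```python
def secondary_attack_rate(secondary_cases: int, susceptible_contacts: int) -> float:
    """Fraction of susceptible contacts of primary cases who become infected."""
    if susceptible_contacts <= 0:
        raise ValueError("susceptible_contacts must be positive")
    if not 0 <= secondary_cases <= susceptible_contacts:
        raise ValueError("secondary_cases must be between 0 and susceptible_contacts")
    return secondary_cases / susceptible_contacts


def reproduction_number(growth_rate_per_day: float,
                        mean_generation_interval_days: float) -> float:
    """Approximate R from growth rate r and mean generation interval T.

    Assumption: the generation interval is exponentially distributed,
    in which case R = 1 + r * T.
    """
    return 1.0 + growth_rate_per_day * mean_generation_interval_days


if __name__ == "__main__":
    # Hypothetical household investigation: 12 of 80 susceptible contacts infected.
    print(f"Secondary attack rate: {secondary_attack_rate(12, 80):.1%}")
    # Hypothetical growth rate of 0.1 per day and a 5-day mean generation interval.
    print(f"Approximate R: {reproduction_number(0.1, 5.0):.2f}")
```

A higher secondary attack rate in one setting than another is the kind of signal that would point contact tracing and isolation toward that setting.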
Measures of infection severity, such as infection-fatality and infection-hospitalization ratios, can help determine the social impact of epidemic transmission and the scope of control measures. Surveillance systems can provide early estimates of infection severity, while prospective cohort studies can provide more reliable information.
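The severity ratios mentioned above are simple proportions with all infections, not only confirmed cases, in the denominator. A minimal sketch follows, using a hypothetical helper name and made-up counts. In practice the number of infections has to be estimated, for example from serosurveys, which is why early surveillance-based values are uncertain.

```python
def severity_ratio(outcomes: int, infections: int) -> float:
    """Proportion of infections that led to a given outcome (death, hospitalization)."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    if not 0 <= outcomes <= infections:
        raise ValueError("outcomes must be between 0 and infections")
    return outcomes / infections


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the study.
    estimated_infections = 10_000
    hospitalizations = 300
    deaths = 50

    ihr = severity_ratio(hospitalizations, estimated_infections)
    ifr = severity_ratio(deaths, estimated_infections)
    print(f"Infection-hospitalization ratio: {ihr:.2%}")
    print(f"Infection-fatality ratio: {ifr:.2%}")
```

Replacing the denominator with confirmed cases gives the case-fatality or case-hospitalization ratio instead, which is usually higher because undetected infections are excluded.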
Robust passive and active surveillance systems can provide information on the number of people hospitalized, infection-related deaths, and the prevalence and incidence of illness. However, these surveillance data are often disaggregated and unable to inform timely and accurate community mitigation measures or the appropriate allocation of healthcare resources.
A new scoping study published in PLOS ONE analyzed the COVID-19 epidemiological research and data available for managing the pandemic in the United States. A scoping study examines the extent, range, and nature of research activity and evidence in a particular field to identify gaps in data.
About the study
The study examined the US CDC’s public website for US government estimates of COVID-19 transmission, severity, disease burden, and surveillance indicators of infection. The evolution of these parameters at different time points in the pandemic was assessed using the Internet Archive Wayback Machine. Disease burden and the scope of surveillance indicators of infection were characterized during November 2020 and 2021.
Thereafter, a scoping review was conducted of observational epidemiology studies with COVID-19-related outcomes. The scoping review protocol was based on the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P). PubMed was searched to identify eligible studies with the target outcomes. Information on the authors’ governmental affiliations, data source, study methods, study setting, analytic outcomes, study population, and data period was extracted for all included studies.
The studies were categorized as analytic or descriptive on the basis of their methods. The analytic studies were subcategorized into retrospective cohort, ecologic, prospective cohort, case-control, or cross-sectional designs. The descriptive studies were subcategorized into incidence studies, cluster investigations, or case series. The primary data source of each study was categorized as administrative program records, serosurveys, medical or vital statistics records, original field data, questionnaire surveys, or active or passive surveillance programs.
Finally, all analytic studies and descriptive incidence studies were analyzed for any of the following outcomes: secondary attack rate; reproductive number or growth rate; serial interval or generation time; incubation period; case or infection hospitalization ratio; seroprevalence; case, infection, or hospital fatality ratio; case status; incidence of infection; predictors of disease severity; emergency department care; predictors of infection incidence; and death or hospitalization.
The results indicated that most estimates of transmission parameters were based on epidemiological studies conducted outside the US. No authoritative estimate was found for the secondary attack rate, nor was any application of the US CDC’s pandemic risk assessment tools. Disease severity was characterized using hospital fatality and infection fatality ratios disaggregated by age. Infection-fatality estimates were obtained from European data, while hospital fatality estimates were obtained from the US CDC COVID-NET active surveillance program.
Weekly reports of national-level COVID-19 surveillance measures were available in April 2020 on a US CDC page titled ‘COVID View.’ Mortality data were obtained from the US National Vital Statistics System, while data on hospital admissions were obtained from active hospital-based surveillance. Data on COVID-19-related deaths and cases disaggregated by race, gender, and age were available on the COVID Data Tracker page from August 2020. National seroprevalence estimates were available after August 2020, while hospital admission data were available after December 2020. Vaccine effectiveness, fatality rates, and cases among health care personnel were available on the COVID Data Tracker page in 2021.
A total of 283 studies met the inclusion criteria, of which 61% were published in the US CDC's Morbidity and Mortality Weekly Report (MMWR). Most of the studies had authors affiliated with a combination of the US CDC, state, or other public health authorities, while some had authors affiliated with the US CDC alone. Moreover, 70% of the studies used data collected before October 2020.
Out of the 283 studies, 180 were categorized as descriptive. One hundred twenty-eight of the descriptive studies described a cluster or series of COVID-19 infections within a particular setting or sub-population. Social gatherings, prisons, and long-term care facilities were the most common settings for descriptive studies. The most commonly reported sub-populations included adolescents, children, staff, and congregate facility residents. Additionally, 51 of the 180 descriptive studies were incidence studies that provided estimates of seroprevalence, cases, emergency department (ED) visits, disease, and incidence of mortality.
Out of the 283 studies, 103 were analytic, comprising mostly retrospective, ecologic, and cross-sectional designs. Common data sources for analytic studies included seroprevalence surveys, field-collected data, or passive surveillance systems. Most of the analytic studies assessed general populations in community settings. Nine analytic studies estimated secondary attack rates, two estimated serial intervals, two estimated reproductive numbers, three estimated excess mortality at different time points, and five estimated symptomatic fractions. Additionally, 25 analytic studies estimated incidence by population, period, and place.
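The study counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this article; the variable and function names are illustrative.

```python
def share(part: int, whole: int) -> float:
    """Return part as a fraction of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Counts as reported in this article.
TOTAL_STUDIES = 283
DESCRIPTIVE_TOTAL = 180
ANALYTIC_TOTAL = 103
DESCRIPTIVE_SUBTYPES = {"cluster_or_case_series": 128, "incidence": 51}

# The two top-level categories should account for every included study.
assert DESCRIPTIVE_TOTAL + ANALYTIC_TOTAL == TOTAL_STUDIES

descriptive_pct = round(share(DESCRIPTIVE_TOTAL, TOTAL_STUDIES) * 100, 1)
analytic_pct = round(share(ANALYTIC_TOTAL, TOTAL_STUDIES) * 100, 1)

# Descriptive studies not covered by the two subtype counts quoted above.
unassigned_descriptive = DESCRIPTIVE_TOTAL - sum(DESCRIPTIVE_SUBTYPES.values())

if __name__ == "__main__":
    print(f"Descriptive: {descriptive_pct}%")  # 63.6%
    print(f"Analytic: {analytic_pct}%")        # 36.4%
    print(f"Descriptive studies outside the quoted subtypes: {unassigned_descriptive}")  # 1
```

Descriptive studies make up roughly two-thirds of the included literature. The quoted subtype counts (128 and 51) sum to 179, one short of the 180 descriptive studies.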
Sixty-six analytic studies assessed one or more predictors of confirmed COVID-19 infection. These predictors varied by study and included gender, age, race, behaviors, comorbidities, occupation, and housing status. Sixteen analytic studies reported variation in the incidence of infection due to differences in demographic characteristics, work location, income levels, vulnerability levels, school-related infections, and zip code-level education.
Many studies examined risk factors among first responders and healthcare workers, with one evaluating serial testing of healthcare workers. Additionally, 23 analytic studies assessed predictors of severe disease outcomes, two assessed the impact of variants on disease severity and hospitalization, and one assessed the risk of in-hospital complications for COVID-19 patients compared to influenza patients.
The study therefore concludes that public health authorities in the US failed to collect the timely and complete epidemiological data needed to guide the response to the COVID-19 pandemic. US public health agencies must identify the reasons for these data gaps and plan for the prioritized, strategic, and timely collection of epidemiological data, as well as for epidemiological research, to prevent future infectious disease epidemics.
The current study has certain limitations. First, it was limited to governmental public health data and research, whereas academic, clinical, and other private US institutions could also contribute to COVID-19 data and research. Second, it did not include unpublished agency analyses or pre-prints, and it missed several relevant published reports. Third, no examination or judgment of data or study quality was conducted.