COVID-19 research made easier with outbreak.info

In a recent paper posted to the bioRxiv* preprint server, researchers reveal the development of an open-source database that provides data on coronavirus disease 2019 (COVID-19) and severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) resources.

Outbreak.info: A standardized, searchable platform to discover and explore COVID-19 resources and data. Image Credit: Studio.c/ Shutterstock


With the ongoing COVID-19 pandemic causing devastation on a global scale, scientists and public health systems alike have been working together to address the challenges the pandemic entails and develop policies to control it.

Since the pandemic began, scientific research has grown at an unparalleled pace, from exploring and testing therapeutic drugs to developing vaccines against SARS-CoV-2. Data suggest that over 52,000 peer-reviewed articles were published during the first year of the COVID-19 crisis, compared with around 1,000 during the first 12 months of the SARS outbreak in 2002.

The staggering magnitude of research data on COVID-19 and SARS-CoV-2, which continues to expand, requires a combined database to house the research data from across various available repositories in a standardized, searchable, interpretable, and easy-to-access interface.

The pandemic has led to the creation of several databases. For instance, numerous websites, most of them maintained by volunteers, report COVID-19 cases across different geographical regions.

LitCovid is a hub for COVID-19 literature, while clinical trial data are stored at the National Clinical Trials (NCT) registry. Because such resources remain spread across separate repositories, a common library that provides access to COVID-19 resources assembled from various sources is needed to aid scientific research.

In the present paper, the authors describe the development of outbreak.info, a website that hosts COVID-19 research data. It was built by collecting metadata from 14 repositories, bringing together COVID-19 resources from hundreds of sources that are scattered across the internet and would otherwise remain disparate.

The database hosts more than 200,000 resources, including publications, clinical trials, and other related datasets. The collected resources were standardized by developing a schema that prioritizes five classes of COVID-19 research data: publications, datasets, clinical trials, analyses, and protocols.
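To make the idea of a standardized schema concrete, here is a minimal sketch in Python. Only the five resource classes come from the article; the `Resource` class, its field names, and the example record are hypothetical and do not reproduce the actual outbreak.info schema.

```python
from dataclasses import dataclass, field
from typing import List

# The five resource classes named in the article.
RESOURCE_TYPES = {"Publication", "Dataset", "ClinicalTrial", "Analysis", "Protocol"}


@dataclass
class Resource:
    """One standardized record. All field names here are hypothetical."""
    identifier: str
    name: str
    resource_type: str
    source: str
    keywords: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject records that do not belong to one of the five classes.
        if self.resource_type not in RESOURCE_TYPES:
            raise ValueError(f"unknown resource type: {self.resource_type!r}")


# Made-up example record, for illustration only.
paper = Resource("example-001", "An example preprint", "Publication", "bioRxiv")
```

Validating the class at construction time means every record that reaches the search index already fits one of the five categories, whichever repository it came from.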

Number of resources in outbreak.info as a function of date.


Metadata is ingested into the website in two ways. The first method uses BioThings software development kit (SDK) data plugins, and the second allows submissions via an online form. A nested list of thematic or topic-based categories was developed from the initial list used by LitCovid, resulting in 11 broad categories and 24 specific child categories. Epidemiological data was ingested from Johns Hopkins University (JHU) and the New York Times (NYT), and genomics data was integrated from the GISAID database.

Findings

After developing the schema, the researchers created data plugins, or parsers, to import metadata from 14 repositories and ingest it into outbreak.info. These parsers update automatically every day to keep the information current. The largest data class was publications, collected from LitCovid and the preprint servers bioRxiv and medRxiv. Clinical trial data from the NCT and the World Health Organization (WHO) formed the second-largest library. The "protocols" class compiled data from two resources, Protocols.io and NCT protocols, while the datasets library sourced its information from Zenodo, the Protein Data Bank (PDB), Figshare, and Harvard Dataverse.
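The parser approach described above can be sketched as one small function per repository, each mapping that repository's raw fields onto a shared record layout. This is an illustrative sketch only: the source keys, raw field names, and output layout are hypothetical, and the code does not use the BioThings SDK plugin interface.

```python
from typing import Callable, Dict, Iterable, List


def parse_preprint(raw: dict) -> dict:
    """Map a (hypothetical) preprint-server record to the common layout."""
    return {"name": raw["title"], "type": "Publication", "date": raw["posted"]}


def parse_trial(raw: dict) -> dict:
    """Map a (hypothetical) trial-registry record to the common layout."""
    return {"name": raw["official_title"], "type": "ClinicalTrial",
            "date": raw["registered_on"]}


# Registry of parsers, keyed by a hypothetical source name.
PARSERS: Dict[str, Callable[[dict], dict]] = {
    "preprints": parse_preprint,
    "trials": parse_trial,
}


def ingest(batches: Dict[str, Iterable[dict]]) -> List[dict]:
    """Run each source's parser over its raw records and pool the results."""
    records: List[dict] = []
    for source, raws in batches.items():
        parser = PARSERS[source]
        for raw in raws:
            record = parser(raw)
            record["source"] = source  # keep provenance on every record
            records.append(record)
    return records


# Made-up input, for illustration only.
sample = {
    "preprints": [{"title": "Example preprint", "posted": "2022-01-01"}],
    "trials": [{"official_title": "Example trial", "registered_on": "2021-06-15"}],
}
combined = ingest(sample)
```

A scheduled job could call `ingest` once a day to mirror the daily refresh the article mentions. Because every parser emits the same keys, the pooled records can be searched and filtered together.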

A. Distribution of resources by resource type and source. B. Heterogeneous and filterable resources (i.e., publications, clinical trials, datasets, etc.) resulting from a single search of the phrase “Delta Variant”.

Data available from Imperial College London (ICL) were imported to fill the "Analysis" library class. The database also includes a feature that allows submissions from volunteers or the wider community. Other features include interactive visualizations of epidemiological data imported from JHU and NYT. Although many other sources compile epidemiological information from JHU, the interface on outbreak.info is built specifically to support research.

Conclusions

The authors of the present work have created a database that provides easy access to COVID-19 and SARS-CoV-2 resources. The massive expansion of research and epidemiological data necessitates a shared library that houses information from many sources in an easy, searchable, standardized, and interpretable interface. This has been achieved by creating outbreak.info, a feature-rich website that allows contributions from the community. Furthermore, the integration of data compiled from various repositories into a single database allows quick exploration and retrieval of COVID-19 resources irrespective of their source.

In summary, the authors created a website that essentially comprises three components: 1) a searchable interface for COVID-19 resources, 2) a tool to explore epidemiology data and spatiotemporal trends, and 3) surveillance reports on SARS-CoV-2 variants and mutations. The website is also integrated with public application programming interfaces (APIs) to allow access to resource data.
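As a rough illustration of what programmatic access through such an API might look like, the sketch below builds a search URL for a query term. The base URL, endpoint path, and parameter names (`q`, `size`) are placeholders assumed for this example, not the documented outbreak.info API. Consult the project's API documentation for the real values.

```python
from urllib.parse import urlencode

# Placeholder endpoint, assumed for illustration; not the real API address.
BASE_URL = "https://api.example.org/resources/query"


def build_query_url(term: str, size: int = 10, base_url: str = BASE_URL) -> str:
    """Return a search URL for `term`, asking for at most `size` results."""
    params = {"q": term, "size": size}  # parameter names are assumptions
    return f"{base_url}?{urlencode(params)}"


url = build_query_url("Delta Variant")
print(url)
```

The resulting URL could then be fetched with any HTTP client, with the response parsed according to whatever format the API documents.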


*Important notice

bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

Journal reference:
Tarun Sai Lomte

Written by

Tarun Sai Lomte

Tarun is a writer based in Hyderabad, India. He has a Master’s degree in Biotechnology from the University of Hyderabad and is enthusiastic about scientific research. He enjoys reading research papers and literature reviews and is passionate about writing.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Sai Lomte, Tarun. (2022, January 25). COVID-19 research made easier with outbreak.info. News-Medical. Retrieved on May 16, 2022 from https://www.news-medical.net/news/20220125/COVID-19-research-made-easier-with-outbreakinfo.aspx.

  • MLA

    Sai Lomte, Tarun. "COVID-19 research made easier with outbreak.info". News-Medical. 16 May 2022. <https://www.news-medical.net/news/20220125/COVID-19-research-made-easier-with-outbreakinfo.aspx>.

  • Chicago

    Sai Lomte, Tarun. "COVID-19 research made easier with outbreak.info". News-Medical. https://www.news-medical.net/news/20220125/COVID-19-research-made-easier-with-outbreakinfo.aspx. (accessed May 16, 2022).

  • Harvard

    Sai Lomte, Tarun. 2022. COVID-19 research made easier with outbreak.info. News-Medical, viewed 16 May 2022, https://www.news-medical.net/news/20220125/COVID-19-research-made-easier-with-outbreakinfo.aspx.
