Computational pipeline for the early identification of emerging SARS-CoV-2 variants



By Pooja Toshniwal Paharia. Reviewed by Danielle Ellis, B.Sc. Aug 16 2022 (revised)

In a recent study posted to the medRxiv* preprint server, researchers developed a computational pipeline for the early identification of emerging severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of interest (VOI) by analyzing SARS-CoV-2 genome data and allocating risk scores on the basis of functional and epidemiological parameters.

*Study: Early Detection of Emerging SARS-CoV-2 Variants of Interest for Experimental Evaluation. Image Credit: CROCOTHERY/Shutterstock*

This news article was a review of a preliminary scientific report that had not undergone peer review at the time of publication. Since its initial publication, the scientific report has been peer reviewed and accepted for publication in a scientific journal. Links to the preliminary and peer-reviewed reports are listed under Journal references at the bottom of this article.

Background

The continual emergence of SARS-CoV-2 variants with enhanced immune evasion, transmissibility, and replication warrants close monitoring of the virus's genomic evolution. Early detection of SARS-CoV-2 VOIs could enable the prioritization of variants for experimental evaluation and risk assessment and could inform the public health response to SARS-CoV-2.

About the study

In the present study, researchers developed a computational heuristic framework to rapidly detect novel emerging SARS-CoV-2 VOIs and prioritize them for wet-lab experiments.

Genomic data for each variant mutation were obtained from the Global Initiative on Sharing All Influenza Data (GISAID), GenBank, and Bacterial and Viral Bioinformatics Resource Center (BV-BRC) databases. The sequences were processed to identify high-priority VOIs for wet-lab experimentation. Variants were prioritized by their epidemiological dynamics and functional characteristics, which were summarized as sequence prevalence scores, functional impact scores, and composite scores.

The framework ranked variant constellations (or covariates) to determine which mutational combinations should be evaluated, and its detection of the Omicron variant was used to validate the computational approach. Genomes were aligned pairwise with the reference (Wuhan-Hu-1 strain) genome, and variant constellations were extracted mainly for the SARS-CoV-2 spike (S) protein. Variants were categorized into geographic and temporal groups, and variant constellation counts and total isolate counts by date and region were used to compute spatiotemporal epidemiological dynamics, namely the variants' monthly growth rates and prevalence rates.
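The computation of monthly prevalence and growth rates from counts can be sketched in Python. This is a minimal illustration and not the authors' code: it assumes that prevalence is the constellation count divided by the total isolate count for a region and month, and that growth rate is the fold change in prevalence relative to the previous month present in the data. The function name `monthly_dynamics` and the sample counts are hypothetical.

```python
from collections import defaultdict


def monthly_dynamics(variant_counts, total_counts):
    """Return {(region, month): (prevalence, growth)} for one variant constellation.

    variant_counts, total_counts: dicts keyed by (region, "YYYY-MM") -> int.
    Assumption: growth is the fold change in prevalence versus the previous
    month in the data; it is None when there is no earlier month or the
    earlier prevalence was zero.
    """
    months_by_region = defaultdict(list)
    for region, month in total_counts:
        months_by_region[region].append(month)

    dynamics = {}
    for region, months in months_by_region.items():
        previous = None
        for month in sorted(months):  # "YYYY-MM" strings sort chronologically
            total = total_counts[(region, month)]
            count = variant_counts.get((region, month), 0)
            prevalence = count / total if total else 0.0
            growth = prevalence / previous if previous else None
            dynamics[(region, month)] = (prevalence, growth)
            previous = prevalence
    return dynamics


# Hypothetical counts for a single region over two months.
variant_counts = {("USA", "2021-11"): 10, ("USA", "2021-12"): 300}
total_counts = {("USA", "2021-11"): 1000, ("USA", "2021-12"): 1000}
for key, (prevalence, growth) in sorted(monthly_dynamics(variant_counts, total_counts).items()):
    print(key, prevalence, growth)
```

Keying by (region, month) keeps the geographic and temporal grouping explicit, so the same table can feed a per-country, per-month scoring step.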

Sequence prevalence scores were calculated from GISAID data as of November 2021 (Omicron dominance period), covering the three most recent months, to establish the epidemiological parameters for the scoring heuristic that the pipeline uses to detect SARS-CoV-2 lineages of potential concern. Each country and month combination with >5% sequence prevalence or a more than five-fold increase in growth rate from the previous month was assigned a score of 1. The scores were summed across all country/month combinations to obtain the final sequence prevalence score.
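The scoring rule described above can be written as a short Python sketch. It is an illustration under stated assumptions rather than the study's implementation: the input layout (a dict mapping each country/month pair to a prevalence fraction and a fold change in growth rate), the function name, and the sample values are all hypothetical. Only the two thresholds (>5% prevalence, more than five-fold growth) come from the text.

```python
def sequence_prevalence_score(dynamics, prevalence_cutoff=0.05, growth_cutoff=5.0):
    """Sum one point per (country, month) that passes either threshold.

    dynamics: {(country, month): (prevalence, growth)} where prevalence is a
    fraction between 0 and 1 and growth is the fold change from the previous
    month (None when no previous month is available).
    """
    score = 0
    for prevalence, growth in dynamics.values():
        fast_growth = growth is not None and growth > growth_cutoff
        if prevalence > prevalence_cutoff or fast_growth:
            score += 1
    return score


# Hypothetical values: two of the three country/month pairs pass a threshold.
example = {
    ("South Africa", "2021-11"): (0.06, None),  # prevalence above 5%
    ("UK", "2021-11"): (0.01, 6.0),             # more than five-fold growth
    ("USA", "2021-11"): (0.02, 2.0),            # neither threshold met
}
print(sequence_prevalence_score(example))  # prints 2
```

Because each qualifying country/month pair contributes exactly one point, the score reflects how widely and how persistently a variant is spreading rather than how large its prevalence is in any single place.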

Functional impact scores (FIS) were derived from positional overlap with SARS-CoV-2 S regions by summing the sequence features of concern (SFoC) scores. SFoC scores were calculated based on variant impact on replication, immune evasion, or binding to angiotensin-converting enzyme 2 (ACE2) receptors or monoclonal antibodies and on variant neutralization by vaccination or previous infection. Composite scores (CS) were calculated by summing the sequence prevalence scores (SPS) and functional impact scores. Emerging lineage scores were calculated from GISAID data between December 2021 and January 2022 by summing the scores of lineages with growth rates >15.
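A minimal Python sketch of the functional impact and composite scores follows. The region boundaries are spike positions mentioned in this article, but the per-region score of 1, the rule that each overlapped region counts once, and the function names are assumptions made for illustration; the study's actual SFoC table and weights are not reproduced here.

```python
def functional_impact_score(mutation_positions, sfoc_regions):
    """Sum the scores of all SFoC regions overlapped by at least one mutation.

    mutation_positions: iterable of spike amino-acid positions (ints).
    sfoc_regions: list of (start, end, score) with inclusive bounds.
    Assumption: a region contributes its score once, however many
    mutations fall inside it.
    """
    positions = set(mutation_positions)
    total = 0
    for start, end, region_score in sfoc_regions:
        if any(start <= pos <= end for pos in positions):
            total += region_score
    return total


def composite_score(sps, fis):
    """Composite score (CS) as the sum of SPS and FIS, as described in the text."""
    return sps + fis


# Regions named in the article; the score of 1 per region is hypothetical.
SFOC_REGIONS = [
    (14, 20, 1),
    (140, 158, 1),
    (245, 264, 1),
    (501, 501, 1),
    (614, 614, 1),
    (671, 692, 1),
]

# Hypothetical constellation with mutations at positions 142, 143, 501, and 681.
fis = functional_impact_score([142, 143, 501, 681], SFOC_REGIONS)
print(fis)                      # prints 3
print(composite_score(2, fis))  # prints 5
```

Keeping the composite as a plain sum means a variant with low prevalence can still rank highly when its mutations overlap many functionally important regions.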

Results

The team identified 75 regions on SARS-CoV-2 S RBD that significantly impacted the binding of ≤4 antibodies and 36 regions with a significant impact on the binding of vaccine or convalescent sera antibodies. Twelve sites with ≥1 mutations exceeding the threshold (>0.1) were identified as indicative of enhanced ACE2 affinity, of which site number 501 was a site of multiple conformational changes in SARS-CoV-2 S RBD binding interactions with ACE2.

Important sites for adaptive immune responses and SARS-CoV-2 tropism were N-terminal domain (NTD) sites 14 to 20, 140 to 158, and 245 to 264; site 614 of SARS-CoV-2 S; and sites 671 to 692 at the furin cleavage site. Epidemiological data for Omicron showed low SPS but considerably high FIS and, as a result, high CS values. CS could also quantify slight differences among covariates of a single clade. BA.1 was the predominant Omicron lineage in December 2021 and showed the highest emerging lineage score.

By January 2022, Omicron lineages such as BA.1, BA.1.1, and BA.2 had evolved with multiple covariates. The BA.2 variant constellation resembled that of Omicron BA.1 but had multiple unique mutational sites. BA.1 carrying the R346K mutation exhibited higher functional impact scores than Omicron BA.1. In contrast, many covariates had sequence prevalence scores of 0, indicating that their growth changes posed no significant threat.

Before January 2022, the N440K, G446S, L24-, R346K, A701V, and L452R mutations appeared sporadically, and mutation dynamics plotting showed that G446S and R346K mutations were less prevalent, whereas L24- was concomitantly more prevalent. The finding indicated a fitness advantage for variants containing L24- and could aid in distinguishing between BA.2 and BA.1.

Conclusion

Overall, the study findings highlighted a novel computational spatiotemporal framework for early detection of SARS-CoV-2 variants based on their sequence prevalence, mutation prevalence, and mutational impacts on SARS-CoV-2 functions such as binding with ACE2 receptors. Framework development faced a few challenges, such as ambiguity and fluctuations in sequence data during Delta and Omicron variant emergence, accurate quantification of data for computation, and the analysis of data that are enormous and continually increasing.

Journal references:

Preliminary scientific report. Wallace, Z. et al. (2022) "Early Detection of Emerging SARS-CoV-2 Variants of Interest for Experimental Evaluation". medRxiv. doi: 10.1101/2022.08.08.22278553. https://www.medrxiv.org/content/10.1101/2022.08.08.22278553v1
Peer reviewed and published scientific report. Wallace, Zachary S., James Davis, Anna Maria Niewiadomska, Robert D. Olson, Maulik Shukla, Rick Stevens, Yun Zhang, Christian M. Zmasek, and Richard H. Scheuermann. 2022. “Early Detection of Emerging SARS-CoV-2 Variants of Interest for Experimental Evaluation.” Frontiers in Bioinformatics 2 (October). https://doi.org/10.3389/fbinf.2022.1020189. https://www.frontiersin.org/articles/10.3389/fbinf.2022.1020189.

Article Revisions

May 13 2023 - The preprint preliminary research paper that this article was based upon was accepted for publication in a peer-reviewed Scientific Journal. This article was edited accordingly to include a link to the final peer-reviewed paper, now shown in the sources section.


Written by

Pooja Toshniwal Paharia

Pooja Toshniwal Paharia is an oral and maxillofacial physician and radiologist based in Pune, India. Her academic background is in Oral Medicine and Radiology. She has extensive experience in research and evidence-based clinical-radiological diagnosis and management of oral lesions and conditions and associated maxillofacial disorders.


