Quantification of SARS-CoV-2 variants in wastewater samples

In a recent study posted to the medRxiv* preprint server, a team of researchers introduced a novel method to estimate the proportions of different severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains in wastewater samples.

Study: Estimating the relative proportions of SARS-CoV-2 strains from wastewater samples. Image Credit: Nhemz/Shutterstock


The coronavirus disease 2019 (COVID-19) pandemic has brought to the fore the urgent need to identify the start of outbreaks, track the transmission of SARS-CoV-2, and follow real-time trends in the spread of COVID-19.

An effective strategy for the early detection and monitoring of SARS-CoV-2 in different population groups is wastewater-based epidemiology (WBE). Because WBE detects SARS-CoV-2 excreted by both symptomatic and asymptomatic individuals, it is a useful tool for characterizing the genetic makeup of the virus circulating in a community.

About the study

The present study investigated a method to provide proportional estimates of SARS-CoV-2 variants in wastewater sampled from a community and to correlate the estimates with the number of COVID-19 cases in that community.

The researchers used an imputation approach to allow a like-for-like comparison of sequencing reads against the reference strains of SARS-CoV-2 variants. Their approach to determining SARS-CoV-2 sequence composition was based on the Tree imputation method. To assess its efficacy, they compared it with the Common allele imputation method: sequenced nucleotides were removed from SARS-CoV-2 sequences and then re-imputed using either Tree imputation or Common allele imputation.
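The masking test described above can be illustrated with a small sketch. It covers only the simpler Common allele baseline, not the Tree imputation method, whose details are not given here. The function names, the list-of-strings alignment layout, and the random masking scheme are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter


def common_allele_impute(alignment, seq_idx, site):
    """Fill a masked site with the most common allele at that site
    among the other sequences in the alignment (hypothetical helper)."""
    counts = Counter(
        seq[site]
        for i, seq in enumerate(alignment)
        if i != seq_idx and seq[site] in "ACGT"
    )
    return counts.most_common(1)[0][0] if counts else "N"


def masking_error_rate(alignment, impute, n_trials=1000, seed=0):
    """Hide one known nucleotide at a time, re-impute it, and report
    the fraction of trials where the imputed base differs from the truth."""
    rng = random.Random(seed)
    errors = 0
    trials = 0
    for _ in range(n_trials):
        i = rng.randrange(len(alignment))
        j = rng.randrange(len(alignment[i]))
        truth = alignment[i][j]
        if truth not in "ACGT":
            continue  # skip ambiguous or missing bases
        trials += 1
        if impute(alignment, i, j) != truth:
            errors += 1
    return errors / trials if trials else 0.0
```

Any other imputation function with the same three arguments could be passed to `masking_error_rate` in place of `common_allele_impute`, which is how two methods could be compared on the same alignment.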

To accurately determine the strain composition of SARS-CoV-2 variants, the researchers developed a new phylogenetic method that allowed data imputation for SARS-CoV-2 sequences. For each sequencing data set, the authors removed genomes with SNP alleles that had an allele frequency below the frequency threshold. This step reduced the alignment to fewer than 1,000 relevant genomes. These relevant genomes were then used to calculate the number of mismatches between each genome and each sequencing read. The probability of each sequencing read arising from each strain was then calculated.
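A rough sketch of these three steps (filtering, mismatch counting, and read probability) might look like the following. The data layouts, the default threshold of 0.01, and the error model (independent per-base errors, with each wrong base equally likely) are assumptions chosen for illustration; the article does not state which values or model the authors used.

```python
def filter_genomes(genome_snps, sample_allele_freq, threshold=0.01):
    """Keep genomes whose SNP alleles all reach `threshold` frequency in the sample.

    genome_snps: {genome_name: {site: allele}} (hypothetical layout)
    sample_allele_freq: {(site, allele): frequency among the sequencing reads}
    """
    return {
        name: snps
        for name, snps in genome_snps.items()
        if all(
            sample_allele_freq.get((site, allele), 0.0) >= threshold
            for site, allele in snps.items()
        )
    }


def count_mismatches(genome_seq, read_start, read):
    """Count positions where an aligned read differs from the genome."""
    window = genome_seq[read_start:read_start + len(read)]
    return sum(1 for g, r in zip(window, read) if g != r)


def read_likelihood(mismatches, read_length, error_rate=0.005):
    """P(read | strain), assuming independent per-base errors where each
    of the three wrong bases is equally likely (an illustrative model)."""
    matches = read_length - mismatches
    return (1.0 - error_rate) ** matches * (error_rate / 3.0) ** mismatches
```

Under this model, a read with fewer mismatches against a genome always receives a higher probability for that genome, so the mismatch counts alone determine how strongly each read supports each strain.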

The researchers used the expectation-maximization (EM) algorithm to estimate the proportions of different SARS-CoV-2 variants. The performance of the algorithm was evaluated by simulating several sets of sequencing reads containing varying numbers of SARS-CoV-2 strains. The estimated proportion of each strain was then compared with the true value.
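For readers unfamiliar with EM, the following is a minimal sketch of the textbook EM update for mixture proportions. It assumes a table of per-read, per-strain probabilities is already available. The function name, the uniform starting point, and the stopping rule are illustrative choices and are not taken from the study.

```python
def em_proportions(likelihoods, n_iter=200, tol=1e-9):
    """Estimate strain proportions from likelihoods[r][s] = P(read r | strain s).

    Standard EM for a mixture with fixed components: the E-step assigns each
    read fractionally to strains, the M-step averages those assignments.
    """
    n_strains = len(likelihoods[0])
    pi = [1.0 / n_strains] * n_strains  # start from equal proportions
    for _ in range(n_iter):
        totals = [0.0] * n_strains
        for row in likelihoods:
            weighted = [p * l for p, l in zip(pi, row)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read is impossible under every strain; ignore it
            for s, w in enumerate(weighted):
                totals[s] += w / norm  # E-step: responsibility of strain s
        total = sum(totals)
        if total == 0.0:
            break  # no informative reads; keep the current estimate
        new_pi = [t / total for t in totals]  # M-step
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi
```

As a quick check, three reads that can only come from the first strain and one that can only come from the second, `em_proportions([[1, 0], [1, 0], [1, 0], [0, 1]])`, yield proportions of 0.75 and 0.25.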


In the present study, the Tree imputation method was found to have error rates of more than 5 × 10⁻⁴ while the Common allele imputation method showed error rates of more than 0.02. Hence, the Tree imputation method was more accurate at reporting sequencing data. The error rates in the Common allele imputation method were due to heterozygosity. The Tree imputation method exhibited errors at sites that switch allelic states often, suggesting a high degree of homoplasy in these sites. It was also noted that the site with the greatest number of imputation errors also had a high proportion of sequencing errors.

According to the researchers, imputation is more accurate for SARS-CoV-2 than for diploid organisms because of the virus's strong phylogenetic structure. The proportions of SARS-CoV-2 variants estimated using the phylogenetic imputation method were similar to the true proportions. However, the proportions of the true strains were overestimated when sequencing coverage was low. At higher depths of coverage, the estimated proportions were more accurate.

The phylogenetic method developed by the researchers showed high accuracy, with error rates comparable to those of typical sequencing methods. The results from the EM algorithm further demonstrated the accuracy of the phylogenetic method in estimating the proportions of SARS-CoV-2 variants in wastewater samples.


The phylogenetic method developed by the researchers enabled effective estimation of SARS-CoV-2 variant proportions in wastewater samples and also strengthened reference databases. This method of quantifying SARS-CoV-2 could also be used to correct sequencing errors by integrating it with algorithms for imputation-informed sequencing. The authors recommend an average sequencing depth of 1000X to achieve higher accuracy with this method.

Given the high transmissibility of COVID-19, effective tools are needed to monitor the spread of SARS-CoV-2 within a community. However, the wide variety of SARS-CoV-2 strains makes this monitoring costly. Wastewater sequencing can serve as a cost-effective and accurate approach to monitoring the spread of SARS-CoV-2 variants and helping to curb further spread of the virus.

*Important notice

medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, guide clinical practice/health-related behavior, or treated as established information.

Journal reference:
Susha Cheriyedath

Written by

Susha Cheriyedath

Susha has a Bachelor of Science (B.Sc.) degree in Chemistry and a Master of Science (M.Sc.) degree in Biochemistry from the University of Calicut, India. She has always had a keen interest in medical and health science. As part of her master's degree, she specialized in Biochemistry, with an emphasis on Microbiology, Physiology, Biotechnology, and Nutrition. In her spare time, she loves to cook up a storm in the kitchen with her super-messy baking experiments.


Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Cheriyedath, Susha. (2022, January 17). Quantification of SARS-CoV-2 variants in wastewater samples. News-Medical. Retrieved on May 16, 2022 from https://www.news-medical.net/news/20220117/Quantification-of-SARS-CoV-2-variants-in-wastewater-samples.aspx.

  • MLA

    Cheriyedath, Susha. "Quantification of SARS-CoV-2 variants in wastewater samples". News-Medical. 16 May 2022. <https://www.news-medical.net/news/20220117/Quantification-of-SARS-CoV-2-variants-in-wastewater-samples.aspx>.

  • Chicago

    Cheriyedath, Susha. "Quantification of SARS-CoV-2 variants in wastewater samples". News-Medical. https://www.news-medical.net/news/20220117/Quantification-of-SARS-CoV-2-variants-in-wastewater-samples.aspx. (accessed May 16, 2022).

  • Harvard

    Cheriyedath, Susha. 2022. Quantification of SARS-CoV-2 variants in wastewater samples. News-Medical, viewed 16 May 2022, https://www.news-medical.net/news/20220117/Quantification-of-SARS-CoV-2-variants-in-wastewater-samples.aspx.


The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of News Medical.