High-throughput sequencing could allow detection of new SARS-CoV-2 variants in wastewater

As new variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pathogen causing the current coronavirus disease 2019 (COVID-19) pandemic, emerge and continue to circulate, early identification and sequencing have become a necessary part of research in this area. The virus is known to be shed from the nose and lungs, and in saliva, urine, and feces.

A new study, published on the medRxiv* preprint server, reports the use of metagenomic sequencing to detect the virus in wastewater samples. The approach revealed variants circulating at the community level that have not yet been identified, or that are present at only very low proportions in clinical databases.

Wastewater surveillance

Earlier studies showed that viral RNA can be detected in the feces and urine of COVID-19 patients as well as of individuals with asymptomatic infection. This knowledge led to the use of wastewater in SARS-CoV-2 surveillance. The approach is not new: wastewater-based epidemiology (WBE) is an established technique for epidemiological surveillance on a broad scale.

Not only is wastewater a cheap and non-invasive source of samples containing the different viral variants in a community, but it also yields real-time data on the strains of SARS-CoV-2 in circulation at the time of the study. This offers an important advantage in vaccine and antiviral development.

Massive sequencing helps identify new mutations

Making full use of this approach depends on high-throughput sequencing, in which multiple viral genomes are analyzed from a spectrum of clinical presentations. By detecting variants present at low frequencies in the population, as well as the number and type of variants currently in circulation, researchers can rapidly detect the emergence of a new variant or its importation into a population, and can also identify polymorphic sites.

The current study includes 40 samples from 14 wastewater treatment plants (WWTPs) serving three different areas in Spain. The viral RNA was detected by reverse transcription-polymerase chain reaction (RT-PCR), targeting the nucleocapsid (N) gene, the envelope (E) gene, and a region called IP4 on the RNA-dependent RNA polymerase (RdRp) gene.

Subsequently, the researchers sequenced the viral genomes recovered from the samples.

All samples belonged to clade 20A, characterized by the mutations C241T, C3037T, C14408T, and A23403G. However, variation was also found at nucleotide positions whose substitutions define other clades of the virus. Two of the samples showed a mix of sequences from clades 20A and 20C at position 25563.

In these two samples, the second mutation that defines clade 20C (C1059T) could not be verified: the position was not sequenced in one sample and had low coverage in the other. The presence of clade 20C therefore could not be confirmed, despite the mixed sequences at position 25563.
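The clade logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's pipeline: the function name, the input format, and the output labels are hypothetical, and the specific allele G25563T is an assumption taken from the commonly used public clade definitions (the article names only the position). A call of None stands for a site that was not sequenced or had too little coverage to judge.

```python
from typing import Dict, List, Optional

# Clade-defining substitutions. The 20A set and the two 20C positions are
# named in the article; the allele "G25563T" is an assumption (see lead-in).
CLADE_20A: List[str] = ["C241T", "C3037T", "C14408T", "A23403G"]
CLADE_20C_EXTRA: List[str] = ["C1059T", "G25563T"]


def _status(calls: Dict[str, Optional[bool]], mutations: List[str]) -> str:
    """Summarize a group of sites as 'present', 'absent', or 'unverified'."""
    values = [calls.get(m) for m in mutations]  # missing key -> None
    if any(v is False for v in values):
        return "absent"
    if any(v is None for v in values):
        return "unverified"
    return "present"


def assign_clade(calls: Dict[str, Optional[bool]]) -> str:
    """Assign a clade label from per-mutation calls.

    calls maps a mutation name to True (detected), False (covered but not
    detected), or None (not sequenced / low coverage).
    """
    base = _status(calls, CLADE_20A)
    if base == "absent":
        return "unassigned"
    if base == "unverified":
        return "20A (unverified)"
    extra = _status(calls, CLADE_20C_EXTRA)
    if extra == "present":
        return "20C"
    if extra == "unverified":
        return "20A, possible 20C (unverified)"
    return "20A"


# Situation resembling the two samples discussed above: the 25563 change
# is seen, but position 1059 could not be checked.
sample = {m: True for m in CLADE_20A}
sample["G25563T"] = True
sample["C1059T"] = None
print(assign_clade(sample))
```

Keeping "not covered" distinct from "covered but absent" is the point of the sketch: a missing defining site leaves the assignment open rather than ruling the clade out.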

Substitutions and deletions detected by sequencing

After adjusting for the depth and quality of the reads, the researchers found almost 240 nucleotide substitutions and six deletions relative to the genome of the reference strain, SARS-CoV-2 isolate Wuhan-Hu-1.

There were over a hundred variant nucleotides in the ORF1a (open reading frame 1a) polyprotein, 67 in ORF1b, 21 in the spike protein, 13 in the membrane protein, 10 in the nucleocapsid gene, and others in other ORFs.
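A per-gene tally of this kind amounts to mapping each variant position onto the reference annotation and counting. The sketch below shows the idea under stated assumptions: the gene coordinates are approximate 1-based values for Wuhan-Hu-1 written from memory and should be checked against the GenBank record before any real use, only a subset of genes is listed, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Approximate 1-based, inclusive gene coordinates on the Wuhan-Hu-1
# reference. These are assumptions for illustration; verify them against
# the official annotation. Only a subset of genes is included.
GENES: List[Tuple[str, int, int]] = [
    ("ORF1a", 266, 13467),
    ("ORF1b", 13468, 21555),
    ("S", 21563, 25384),
    ("ORF3a", 25393, 26220),
    ("M", 26523, 27191),
    ("N", 28274, 29533),
]


def gene_at(position: int) -> str:
    """Return the gene containing a genome position, or 'other'."""
    for name, start, end in GENES:
        if start <= position <= end:
            return name
    return "other"


def tally_by_gene(positions: Iterable[int]) -> Counter:
    """Count variant positions per gene."""
    return Counter(gene_at(p) for p in positions)


# Positions of clade-defining substitutions mentioned in the article.
counts = tally_by_gene([241, 1059, 3037, 14408, 23403, 25563])
for gene, n in sorted(counts.items()):
    print(gene, n)
```

Positions outside the listed genes, such as 241 in the 5' untranslated region, fall into the "other" bucket.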

When strains containing these mutations were found in a sample along with sequences containing the nucleotide found in the reference genome, such samples were termed mixed.
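One simple way to express the "mixed" label is in terms of read counts at a single site. The following is a sketch only: the depth and frequency thresholds are hypothetical placeholders, not values reported by the study, and a real pipeline would also weigh base quality and strand bias.

```python
def classify_site(ref_reads: int, alt_reads: int,
                  min_depth: int = 20, min_fraction: float = 0.05) -> str:
    """Label one genome position from its read counts.

    ref_reads: reads carrying the reference nucleotide
    alt_reads: reads carrying the substituted nucleotide
    min_depth and min_fraction are illustrative thresholds (assumptions).
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return "low coverage"      # too few reads to judge either way
    alt_fraction = alt_reads / depth
    if alt_fraction < min_fraction:
        return "reference"         # substitution absent or within noise
    if alt_fraction > 1 - min_fraction:
        return "variant"           # essentially all reads carry the change
    return "mixed"                 # both alleles clearly present


print(classify_site(60, 40))
print(classify_site(5, 3))
```

Because wastewater pools virus from many infected people, a mixed call at a site is expected whenever more than one lineage is circulating in the catchment area.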

In the membrane (M) protein gene, almost half of all substitutions were non-synonymous, whereas in ORF7a and ORF10 all the substitutions were non-synonymous.
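Whether a substitution is synonymous or non-synonymous depends on whether the altered codon still encodes the same amino acid. The sketch below uses the standard genetic code to make that check; the function name and its arguments are hypothetical, and it assumes the codon and the position of the change within it are already known.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third base varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, offset: int, new_base: str):
    """Classify a single-nucleotide change within one codon.

    codon:    reference codon, e.g. "GAT"
    offset:   0-based position of the change within the codon
    new_base: substituted nucleotide
    Returns (kind, reference amino acid, new amino acid).
    """
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return kind, before, after


# GAT -> GGT changes aspartate (D) to glycine (G), as in the well-known
# spike D614G change caused by A23403G.
print(classify_substitution("GAT", 1, "G"))
# TTC -> TTT leaves phenylalanine (F) unchanged.
print(classify_substitution("TTC", 2, "T"))
```

Third-position changes are often silent, as in the second call, which is why a count of nucleotide substitutions overstates the number of amino acid changes.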

Of the 21 spike substitutions, 13 were non-synonymous. Ten of these corresponded to already known mutations, while three reflected novel amino acid substitutions. Seven of the 13 have not so far been reported in genomes from Spanish patients, while three amino acid changes have been found at low frequencies among genomes isolated from Spanish specimens.

Six nucleotide deletions were also reported, of which four were in ORF1a and one each in the spike and ORF3a genes.

What are the implications?

The study shows the potential importance of sequencing SARS-CoV-2 in sewage, both for identifying new mutations and clades of the virus and for detecting the viral clades in circulation in real time.

The genomic sequencing of such strains could provide complementary information to the results of clinical laboratory testing. For instance, in the current study, three novel nucleotide substitutions were found in the spike gene in wastewater samples.

Another example is the set of mutations found at low frequency in genomic reads from clinical specimens but confirmed in the wastewater sequencing results.

A limitation of the study is the variation between samples in the coverage of important genomic regions. This suggests that high-throughput sequencing efforts targeting a specific region, such as the spike gene or the nucleotide positions that define a clade, would be more valuable for identifying and annotating genomic variants, whether unknown or detected only at low frequencies, within a region or community.

*Important Notice

medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.


Written by

Dr. Liji Thomas



Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Thomas, Liji. (2021, February 15). High-throughput sequencing could allow detection of new SARS-CoV-2 variants in wastewater. News-Medical. Retrieved on September 27, 2022 from https://www.news-medical.net/news/20210215/High-throughput-sequencing-could-allow-detection-of-new-SARS-CoV-2-variants-in-wastewater.aspx.


