Since the start of the COVID-19 pandemic, the causative virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has undergone several mutations. This has led to significant variability of the virus within a single host.
Results of study. Image Credit: https://www.biorxiv.org/content/10.1101/2020.12.09.417519v1.full.pdf
A new study published on the preprint server bioRxiv* describes the prevalence of single-nucleotide variations within a single host. It suggests that 1) this phenomenon seems to be much more common in the Indian population, 2) it may lead to the emergence of many variants in the population, and 3) these variants may have different spike protein characteristics, leading to potential antibody resistance.
SARS-CoV-2 is an RNA virus, and as such was expected to show a high rate of mutation in its genome. Single-nucleotide variants (SNVs) lead to the formation of multiple quasispecies, with very similar but not identical genotypes.
Several mechanisms underlie this high mutational tendency, including host-mediated RNA editing by deaminases. This has been suggested to be a major cause of intra-host variability for this virus, as well.
RNA Editing in SARS-CoV-2
Two important RNA editing enzymes are the apolipoprotein B mRNA editing catalytic polypeptide-like and Adenosine Deaminase RNA Specific 1 enzymes, known as APOBEC and ADAR1, respectively. They are known to be activated in innate antiviral immunity for many viruses, including the coronavirus family.
While APOBEC enzymes deaminate cytosine to uracil on single-stranded RNA, causing a C-T transition, ADARs deaminate adenines to inosines on double-stranded RNA. Since inosine pairs with cytosine, it is read as guanine at the next replication step, so guanine ends up replacing adenine at that position, an A-G transition.
If these enzymes act on the negative strand of SARS-CoV-2, the resulting changes will be G-A and T-C transitions, with APOBEC and ADAR1, respectively. In both cases, these small changes can have far-reaching effects on the secondary structure of RNA, its regulatory regions, protein structure and function, and the interactions between the virus and its host.
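The substitution signatures described above can be summarized in a short Python sketch. It is an illustration only, not the pipeline used in the study: the names COMPLEMENT, DIRECT_EDITS, build_signature_table, and classify_substitution are hypothetical, and bases are written in the DNA alphabet (T in place of U), as in the text. A substitution that matches a signature is merely consistent with editing by that enzyme and does not prove it.

```python
# Hypothetical helper names; a sketch of the editing signatures, not the study's method.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Change each enzyme makes on the strand it edits (inosine is read as G).
DIRECT_EDITS = {"APOBEC": ("C", "T"), "ADAR": ("A", "G")}


def build_signature_table():
    """Map (ref, alt) on the positive strand to (enzyme, edited strand)."""
    table = {}
    for enzyme, (ref, alt) in DIRECT_EDITS.items():
        # Editing of the positive strand is observed as-is.
        table[(ref, alt)] = (enzyme, "positive")
        # Editing of the negative strand shows up as the complementary change.
        table[(COMPLEMENT[ref], COMPLEMENT[alt])] = (enzyme, "negative")
    return table


SIGNATURES = build_signature_table()


def classify_substitution(ref, alt):
    """Return (enzyme, strand) if the change matches a signature, else None."""
    return SIGNATURES.get((ref.upper(), alt.upper()))


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("G", "A"), ("A", "G"), ("T", "C"), ("C", "A")]:
        print(f"{ref}>{alt}:", classify_substitution(ref, alt))
```

In this sketch, a change such as C-A matches neither signature and returns None.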
Of the many thousands of mutations that are possible, very few result in enhanced viral fitness, for example by conferring immune evasion or drug resistance.
Considering the viral population within a host to consist of all intra-host quasispecies together, the fitness of the virus within this host will include the contribution of all the haplotypes. The current study was aimed at exploring intra-host single-nucleotide variations (iSNVs) to understand which points within the genome contribute to intra-host viral fitness in this manner.
The researchers used transcriptomic data covering the whole of the transcribed RNA from over 1,300 viral isolates. These came from India (from multiple subgroups), China, Malaysia, Germany, the UK, and the USA. More than 86,000 iSNVs were identified, at a median of 19 per sample, and these calls seem to be reliable.
They found widespread evidence of RNA editing by both enzymes, with ADAR1 activity appearing excessive in some cases. A-G and T-C substitutions, caused by ADAR1, made up about 36% of all variant positions, but in relatively few samples, especially the former.
Both types of editing resulted in synonymous and missense mutations to the same extent, but C-T or G-A changes also caused stop-gain mutations. The latter can lead to a change in the amount of the functional protein product synthesized from the viral genome. The iSNVs brought about amino acid changes at almost equal frequencies, but many of them caused non-synonymous mutations and stop-gain mutations.
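Why stop-gain mutations are tied to C-T and G-A changes in particular can be checked by enumerating codons. The following Python sketch assumes the standard stop codons (TAA, TAG, TGA, written in the DNA alphabet) and a single substitution in the positive-sense coding sequence; the function name stop_gain_codons is hypothetical and is not taken from the study.

```python
from itertools import product

# Standard stop codons, written with T in place of U.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_gain_codons(ref_base, alt_base):
    """List (sense_codon, stop_codon) pairs created by one ref>alt substitution."""
    pairs = []
    for codon in ("".join(bases) for bases in product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue  # only sense codons can "gain" a stop
        for i, base in enumerate(codon):
            if base != ref_base:
                continue
            mutated = codon[:i] + alt_base + codon[i + 1:]
            if mutated in STOP_CODONS:
                pairs.append((codon, mutated))
    return sorted(pairs)


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("G", "A"), ("A", "G"), ("T", "C")]:
        print(f"{ref}>{alt}:", stop_gain_codons(ref, alt))
```

Under these assumptions, C-T changes can turn CAA, CAG, and CGA into stop codons, and G-A changes can turn TGG into TAG or TGA, whereas A-G and T-C changes cannot convert any sense codon into a stop codon.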
Population-Linked Changes in iSNV Incidence
The incidence of iSNVs appears to be non-uniform with respect to the population studied. If so, these changes might reflect current editing activity within the groups studied. Indian isolates showed a significantly greater number of iSNVs across all subpopulations than European, Chinese, or US samples, despite comparable numbers of samples from each of these regions. This difference could be due to genetic variation within populations shaped partly by the positive selection pressure of a heavy viral load.
Interestingly, malaria-endemic regions in India show an insertion allele for APOBEC3B which reduces the incidence of severe malaria, and these regions also have negligible COVID-19 mortality.
“Though the correlation of APOBEC insertion with protection still needs to be tested, this nevertheless suggests the role of such family of enzymes in evolutionary outcomes of SARS-CoV-2 infection and the burden of disease.”
Effects of iSNVs on Viral Diversity
The scientists also found the sites where iSNVs seem to lock into genomic variations that persist as viral variants over time. Most of these genetic loci seem to be common to sets of viral isolates from anywhere in the world. This could mean they are preferred sites for these RNA editing enzymes.
The researchers point out that the A2a clade, which is now the most common isolate, as well as the A3i/A4 clade, which was unique to India and has now almost been replaced, both show variations in the frequencies of nucleotides at their defining positions. C-T or A-G mutations are seen in all defining variants of clade A2a, such as D614G.
Some of the samples of RNA seemed to be subject to a very high editing frequency, as shown by numerous allele variations at multiple sites. Analysis of these samples showed that in about a third, the change involved A-G, caused by high ADAR activity. Additionally, the spike iSNVs seem to affect functionally important residues, causing almost 1,500 protein-coding variants.
Functional Consequences and Implications
The most common alteration was D614G, followed by changes at Y91, I105, and D428. There were observable hyper-variable sites within the amino acid sequence of the spike protein, which seemed to give rise to multiple variants. The receptor-binding motif (RBM) appeared to be hyper-variable: for instance, 11 of 25 hyper-edited samples showed changes in specific residues at three particular positions. These could lead to antigenic changes, adaptations in biological function, and mutational escape from antibody neutralization.
Such changes have been observed in the case of an immunocompromised individual in whom the virus lingered for months, undergoing numerous mutations to produce a host of non-synonymous changes. These heavily favored the spike protein and the receptor-binding domain (RBD), which accounted for 57% and 38% of the changes though they make up only 13% and 2% of the genome, respectively.
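The degree of concentration implied by these percentages can be expressed as a fold enrichment, that is, the share of changes divided by the share of the genome. The Python sketch below uses only the rounded figures quoted above, so the results are approximate. The function name fold_enrichment is hypothetical.

```python
def fold_enrichment(share_of_changes, share_of_genome):
    """Ratio of observed share of changes to the region's share of the genome."""
    return share_of_changes / share_of_genome


# Rounded percentages as quoted in the text, expressed as fractions.
spike_enrichment = fold_enrichment(0.57, 0.13)
rbd_enrichment = fold_enrichment(0.38, 0.02)

if __name__ == "__main__":
    print(f"Spike: about {spike_enrichment:.1f}-fold")
    print(f"RBD:   about {rbd_enrichment:.1f}-fold")
```

On these figures, changes were roughly 4.4 times more frequent in the spike gene, and about 19 times more frequent in the RBD, than their share of the genome alone would predict.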
The researchers offer a thought-provoking comment: “These observations substantiate that editing within hosts may lead to an evolved immune escape ability in some strains which may seem to be a case of reinfection in a host after weeks or months of the first incidence.”
Other important alterations, such as Q493K and N439K, have been related to antibody resistance and immune escape, respectively. These findings underline the existence of extensive cross-talk between the virus and the host cell. The effects of one such interaction, mediated by RNA editors, include rapid and functionally important changes in the SARS-CoV-2 genome.
“This study highlights the need for capturing iSNVs to enable more accurate models for molecular epidemiology as well as for diagnostics and vaccine design.”
*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.