Several patients developed pneumonia of an unknown cause in late 2019 in Wuhan, China. Deep metagenomic sequencing of bronchoalveolar lavage fluid samples led to the identification of a new severe acute respiratory syndrome (SARS)-like coronavirus.
Ultimately, this novel virus was designated SARS coronavirus 2 (SARS-CoV-2) by the International Committee on Taxonomy of Viruses, while the disease caused by the virus was named coronavirus disease 2019 (COVID-19) by the World Health Organization (WHO).
Six human pathogenic coronaviruses (hCoVs) had previously been discovered: NL63, 229E, OC43, HKU1, SARS-CoV, and the Middle East respiratory syndrome coronavirus (MERS-CoV). Among these viruses, SARS-CoV and MERS-CoV were highly pathogenic.
SARS-CoV first emerged in China in 2002, while MERS-CoV emerged in Saudi Arabia in 2012. The transmission rate of SARS-CoV-2 is higher than that of SARS-CoV and MERS-CoV, which has resulted in a global pandemic. The emergence and diversity of coronaviruses can be attributed to recombination, tissue tropism, and adaptation of the virus to a new host species.
A review article published in the journal Archives of Virology discusses the evolution of SARS-CoV-2 and the emergence of variants of concern.
Emergence and expansion of SARS-CoV-2 clades
Shortly after the SARS-CoV-2 genome from the first infected individual was sequenced, thousands of whole-genome sequences of the virus were submitted to public databases such as GISAID. Additional bioinformatics platforms have enabled real-time tracking and visualization of sequence variations. The evolutionary rate of SARS-CoV-2 was found to be approximately 1.1 × 10⁻³ substitutions/site/year.
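To put this rate in perspective, it can be converted into an expected number of substitutions per genome. The short sketch below does that arithmetic. The genome length of 29,903 nucleotides is an assumption taken from the commonly used Wuhan-Hu-1 reference sequence and is not stated in the review summary; the function name is purely illustrative.

```python
# Rate quoted in the text: about 1.1e-3 substitutions per site per year.
RATE_PER_SITE_PER_YEAR = 1.1e-3

# Assumption: length of the Wuhan-Hu-1 reference genome (NC_045512.2), in nucleotides.
GENOME_LENGTH_NT = 29_903


def expected_substitutions(rate_per_site_per_year, genome_length, years=1.0):
    """Expected number of substitutions accumulated along one lineage."""
    return rate_per_site_per_year * genome_length * years


per_year = expected_substitutions(RATE_PER_SITE_PER_YEAR, GENOME_LENGTH_NT)
per_month = expected_substitutions(RATE_PER_SITE_PER_YEAR, GENOME_LENGTH_NT, years=1 / 12)
print(f"about {per_year:.0f} substitutions per genome per year")
print(f"about {per_month:.1f} substitutions per genome per month")
```

Under these assumptions, the rate corresponds to roughly 33 substitutions per genome per year, or between two and three per month along a single lineage.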
In a survey of 2,790 complete, high-coverage sequences conducted in April 2020, 13 distinct haplotype groups (H1-H13) and their associated daughter haplotypes were identified. Some of these haplotypes spread to various regions of the world, while others remained localized or were eliminated.
Differences in the history of sequence variation play an important role in determining viral fitness. This is exemplified by haplotype H1, which is defined by the co-segregation of four variations, one of which causes the p.D614G substitution. This substitution was rarely found in March 2020 but became dominant by June 2020.
This variation conferred a competitive advantage by enhancing viral replication, viral load, and infectivity. At the molecular level, p.D614G favors an open conformation of the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein, which facilitates viral entry.
Several subclades, such as GR, GH, and GV, have evolved from the G clade. The three-nucleotide substitution of GR causes the p.R203K and p.G204R amino acid substitutions in the N gene. The 25563G>T variation in GH causes p.Q57H in the protein encoded by orf3a, whereas the 22227C>T variation of the GV subclade causes p.A222V in the N-terminal domain (NTD) of the spike protein.
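The link between a nucleotide position and the affected amino acid can be checked by locating the nucleotide within the reading frame of its gene. The sketch below does this for the two positions quoted above. The gene start coordinates are assumptions taken from the Wuhan-Hu-1 reference genome (NC_045512.2) and do not appear in the review summary; the function and dictionary names are illustrative.

```python
# Assumption: 1-based start coordinates of each coding sequence in the
# Wuhan-Hu-1 reference genome (NC_045512.2); not given in the article.
GENE_START = {
    "S": 21563,      # spike
    "ORF3a": 25393,
}


def locate_in_codon(genome_pos, gene):
    """Return (codon number, position within the codon), both 1-based."""
    offset = genome_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"position {genome_pos} lies upstream of {gene}")
    return offset // 3 + 1, offset % 3 + 1


for label, pos, gene in [
    ("25563G>T (GH)", 25563, "ORF3a"),
    ("22227C>T (GV)", 22227, "S"),
]:
    codon, base = locate_in_codon(pos, gene)
    print(f"{label}: {gene} codon {codon}, base {base} of the codon")
```

With these coordinates, position 25563 falls in codon 57 of orf3a and position 22227 falls in codon 222 of the spike gene, which agrees with the amino acid positions 57 and 222 reported for the GH and GV subclades.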
Vaccine development against COVID-19 began through companies and government-initiated programs soon after the emergence of SARS-CoV-2. It was believed that only vaccination could curb the ongoing pandemic by reducing viral transmission.
In one study that followed up symptomatic patients infected very early in the pandemic, the researchers observed that neutralizing antibodies were present at high positivity rates six months after recovery. However, multiple reports of SARS-CoV-2 re-infection in individuals who had only recently recovered from an earlier infection raised concerns among health officials and scientists about the duration of immunity and the protection conferred by vaccines.
The emergence of SARS-CoV-2 variants of concern
A variant of interest (VOI) is a virus strain that is associated with an increase in public-health-related parameters such as transmissibility, pathogenicity, therapeutic escape, severity of clinical presentation, and antigenicity. By comparison, a variant of concern (VOC) is defined as a strain associated with more drastic changes in these parameters.
Five SARS-CoV-2 VOCs, including the Alpha, Beta, Gamma, Delta, and Omicron variants, as well as four VOIs, including the Eta, Iota, Kappa, and Lambda variants, had been recognized by the WHO as of late November 2021.
The Alpha variant was first detected in the United Kingdom in mid-December 2020. This strain of SARS-CoV-2 carried mutations in the spike protein that altered key epitopes of the protein. The Alpha variant quickly spread to 90 different countries by December 2020 and became the dominant strain by March 2021.
The Beta strain was first identified in South Africa, whereas the Gamma and Delta variants were identified in Brazil and India, respectively. The Delta variant eventually became the dominant circulating strain around the world by May 2021.
Effectiveness of the current vaccines against VOCs
Recent studies suggest that the Alpha strain of SARS-CoV-2 has a significant transmission advantage over earlier strains. Moreover, this variant leads to a higher viral load and a higher risk of infection in individuals under the age of 20.
However, the Alpha strain was found to be highly susceptible to neutralization by antibodies present in sera or nasal swabs from individuals who had recovered from infection with the original strain or from those individuals who had been vaccinated.
The Alpha, Beta, Gamma, and Omicron variants share the p.N501Y mutation in the RBD of the spike protein, whereas the Delta variant does not carry it. This mutation plays an important role in the increased infectivity and transmissibility of these strains.
The Beta and Gamma variants also contain the p.E484K mutation in the same domain, whereas the Omicron variant carries p.E484A at this position. The Beta and Gamma variants additionally exhibit a change of lysine at position 417 to asparagine and threonine, respectively. The reduced neutralization of the Beta and Gamma strains by antibodies in sera from convalescent patients may be due to steric clashes and charge switches at antibody-binding sites caused by the p.E484K mutation.
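The idea of a charge switch can be made concrete by comparing the side-chain charge of the original and replacement residues. The sketch below is a simplified illustration, not a method from the review: the charge table reflects standard textbook values at physiological pH, histidine is treated as neutral, and the function name is hypothetical.

```python
import re

# Approximate side-chain charge at physiological pH (standard biochemistry,
# not taken from the article). All other residues are treated as neutral.
CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def charge_change(substitution):
    """Parse a substitution such as 'p.E484K' and return (position, before, after)."""
    match = re.fullmatch(r"(?:p\.)?([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    ref, pos, alt = match.groups()
    return int(pos), CHARGE.get(ref, 0), CHARGE.get(alt, 0)


for sub in ["p.E484K", "p.K417N", "p.K417T", "p.N501Y", "p.L452R", "p.D614G"]:
    pos, before, after = charge_change(sub)
    if before * after < 0:
        kind = "charge reversal"
    elif before != after:
        kind = "charge change"
    else:
        kind = "no charge change"
    print(f"{sub}: {before:+d} -> {after:+d} ({kind})")
```

In this simplified view, p.E484K replaces a negatively charged glutamate with a positively charged lysine, a full charge reversal, while the changes at position 417 remove a positive charge and p.N501Y leaves the charge unchanged.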
The Alpha and Delta strains both contain a variation at position 681 of the spike protein. This mutation leads to increased transmissibility of the virus. Additionally, the Delta strain carries a unique p.L452R variation in the RBD of the spike protein, while the Alpha strain contains a unique p.69-70delHV mutation in the spike protein.
The analysis of 50 selected genome sequences suggested that 32, 21, 35, and 20 sequence changes occur among the Alpha, Beta, Gamma, and Delta genome sequences, respectively. Additionally, all these sequence changes were usually, but not always, present together. Furthermore, four sequence variations that define clade G were shared among all the Alpha, Beta, Gamma, and Delta VOCs, which suggests that these VOCs are derived from this major clade.
Vaccine efficacy studies have suggested increased effectiveness against the Alpha and Delta strains after a second vaccine dose, which further emphasizes the importance of booster vaccination. Also, a G614 pseudovirus was found to be more susceptible to neutralization than D614-encoding viruses. This indicates that G614 is not an escape variation and that the global dominance of G614 over D614 is not expected to impact the effectiveness of the current COVID-19 vaccines.
Potential sources of changes in the SARS-CoV-2 genome
SARS-CoV-2 sequence variability can be brought about by within-host recombination events, genome replication infidelity, host RNA-editing systems, and intra-host viral evolution in prolonged infections. Random replication errors are responsible for most of the genomic changes in SARS-CoV-2.
Recombination is currently the main alternative to genome replication errors as a source of new SARS-CoV-2 variants. Recombination between SARS-CoV-2 genomes requires co-infection with viruses that have distinct sequences.
As the frequency of infections increases worldwide, the frequency of co-infection increases, which in turn increases the opportunity for recombination. However, co-infection and recombination are expected to occur mainly between locally or globally dominant strains.
Since the emergence of SARS-CoV-2, several variations in the genome of this virus have been identified. Mutations that confer an advantage to the virus tend to be retained, while deleterious mutations are eliminated.
Several frequent variations in the RBD of the spike protein result in the emergence of VOCs. These variations have an impact on infectivity, transmissibility, and immune escape.
Since the role of the spike protein is critical in SARS-CoV-2 infection, future vaccine development strategies should consider features of the spike proteins of the VOC strains. Furthermore, the continuous tracking of novel and clinically important sequence variations is of immense importance for public health, disease control, and the design of new preventive immunization strategies.
- Safari, I., & Elahi, E. (2021). Evolution of the SARS‑CoV‑2 genome and emergence of variants of concern. Archives of Virology. doi:10.1007/s00705-021-05295-5.