In a recent study posted to the medRxiv* pre-print server, researchers defined a novel polynucleotide-based Distinctiveness metric to capture the proteome-level differences of emerging severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants and lineages against all previously identified ones.
The canonical concept of mutation relies primarily on reference ancestral viral sequences that are invariant over time. From the perspective of viral evolution, the Distinctiveness of SARS-CoV-2 sequences, based on a proteome-wide comparison, could account for regional exposure to prior lineages and thus capture the pressure to evolve new strains endowed with protein sequences to which communities have not previously been exposed.
As new SARS-CoV-2 lineages evolve, it is crucial to understand the determinants of competitively fitter strains and to rapidly detect potential variants of concern (VOCs), as this can aid in achieving robust pandemic preparedness.
Throughout the coronavirus disease 2019 (COVID-19) pandemic, new SARS-CoV-2 variants have evolved, harboring unique mutations, including deletions, substitutions, and insertions. Some of these variants are designated as VOCs by the World Health Organization; to date, five SARS-CoV-2 VOCs have been identified, including Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2), and most recently Omicron (B.1.1.529).
About the study
In the current study, researchers examined the evolution of the Delta variant in India versus Brazil and investigated whether Delta's Distinctiveness could be generalized globally. To this end, they computed the mutational load of each sequence, defined as the number of mutations away from the ancestral Wuhan-Hu-1 sequence. Likewise, they determined the Distinctiveness of each sequence, defined as the average amino-acid-level distance between that sequence and all sequences collected at least one calendar day before it.
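The two definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Sequence` record, the mutation labels such as "S:D614G", and the choice of distance (the number of amino-acid changes present in one sequence but not the other) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import mean
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Sequence:
    """Hypothetical record: one aligned genome reduced to its amino-acid changes."""
    collected: date
    mutations: FrozenSet[str]  # changes relative to the reference, e.g. "S:D614G"


def mutational_load(seq: Sequence) -> int:
    """Number of amino-acid changes away from the reference sequence."""
    return len(seq.mutations)


def distance(a: Sequence, b: Sequence) -> int:
    """Assumed distance: count of changes found in exactly one of the two sequences."""
    return len(a.mutations ^ b.mutations)


def distinctiveness(seq: Sequence, pool: List[Sequence]) -> Optional[float]:
    """Average distance from `seq` to every sequence in `pool` collected
    at least one calendar day earlier; None if no earlier sequence exists."""
    cutoff = seq.collected - timedelta(days=1)
    prior = [p for p in pool if p.collected <= cutoff]
    if not prior:
        return None
    return mean(distance(seq, p) for p in prior)
```

Under these assumptions, mutational load depends only on the fixed reference, whereas Distinctiveness changes with whatever has already been collected, so the same sequence can score differently from one region or date to another.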
They correlated the average Distinctiveness of sequences in a set with the change in prevalence of the corresponding set, defined as prevalence (t+56 to t+84) - prevalence (t to t+28), where t is time in days, thus comparing two 28-day time windows. The analysis included only those countries (28 in total) with at least 100 sequences collected in each of the two 28-day time windows.
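The change-in-prevalence calculation can be sketched as follows. This is an illustrative reading of the definition rather than the study's code: the `(collection date, lineage)` record layout, the function names, and the half-open window boundaries are assumptions.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

Record = Tuple[date, str]  # (collection date, lineage label); hypothetical layout

MIN_SEQUENCES = 100  # minimum sequences required in each 28-day window


def window_prevalence(records: List[Record], lineage: str,
                      start: date, days: int = 28) -> Optional[float]:
    """Fraction of sequences in [start, start + days) that belong to `lineage`.
    Returns None when the window holds fewer than MIN_SEQUENCES sequences."""
    end = start + timedelta(days=days)
    in_window = [lin for d, lin in records if start <= d < end]
    if len(in_window) < MIN_SEQUENCES:
        return None
    return sum(lin == lineage for lin in in_window) / len(in_window)


def prevalence_change(records: List[Record], lineage: str,
                      t: date) -> Optional[float]:
    """prevalence(t+56 to t+84) minus prevalence(t to t+28) for one country."""
    early = window_prevalence(records, lineage, t)
    late = window_prevalence(records, lineage, t + timedelta(days=56))
    if early is None or late is None:
        return None  # country and window pair excluded from the analysis
    return late - early
```

A positive value means the lineage gained share between the two windows, and a `None` result corresponds to a country or time window that would be excluded for having too few sequences.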
Further, they analyzed the Distinctiveness of BA.1 and BA.2 lineages of Omicron using the Omicron sequences from India and the USA.
The researchers obtained individual mutation-related data for each aligned SARS-CoV-2 sequence from the Global Initiative on Sharing Avian Influenza Data (GISAID) database. They included only complete, high-coverage sequences from the GISAID data, except for Omicron lineages, for which this filter would have excluded ~97% of complete sequences, compared with 27% for all other SARS-CoV-2 lineages.
They determined positions of amino acids relative to the ancestral Wuhan-Hu-1 strain (used as a reference) and treated each insertion as a single modification. Overall, the study analysis included 280 data points, spanning 71 time windows in 28 countries.
The study analysis showed that the mutational load and Distinctiveness of the Delta variant in India (where it first emerged in January 2021) were significantly higher than those of other contemporary lineages. In Brazil, however, Delta's Distinctiveness remained higher, while its mutational load attained levels similar to those of other contemporary lineages. Subsequently, the Delta variant outcompeted the Gamma variant (the predominant VOC in Brazil before the arrival of Delta) to become the dominant strain in Brazil.
The relative Distinctiveness of emergent SARS-CoV-2 lineages was associated with their competitive fitness (Pearson r = 0.67), as reflected in the change in lineage prevalence over eight weeks. In some cases, even the same SARS-CoV-2 lineage had different Distinctiveness-contributing positions in different geographies. In contrast, mutational load had a weaker association with competitive fitness.
As expected, the Distinctiveness of Omicron lineages was significantly higher than that of contemporary sequences. Interestingly, the Distinctiveness values of the Omicron BA.1 and BA.2 sub-lineages were similar, suggesting that they might have similar levels of competitiveness. Moreover, the Distinctiveness of Omicron in the US was high in Idaho, warranting further investigation of the determinants of sub-regional Distinctiveness within SARS-CoV-2 variants.
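For readers unfamiliar with the statistic, a Pearson correlation between per-lineage average Distinctiveness and change in prevalence can be computed as in the sketch below. The function name and inputs are hypothetical, and no study data are reproduced here.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

Here `xs` would hold the average Distinctiveness of each lineage in a given country and time window, and `ys` the matching change in prevalence; a value near 1 indicates that more distinctive lineages tended to gain prevalence.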
Discussion and conclusions
The Distinctiveness metric used in the current study sheds light on two facets of SARS-CoV-2 evolution and how this evolution helps the virus fight host immunity.
First, it reflects that new amino acids acquired via mutations, compared to prior strains, confer an evolutionary benefit at the level of infection or replication; likewise, in-frame genomic insertions generate distinctive amino acids with evolutionary benefits.
Second, and conversely, high Distinctiveness can imply the loss of amino acids present in previously circulating strains, reflecting that viruses discard deleterious sequences, such as those recognized by host antibodies, as they evolve. The most striking example is the in-frame deletions in the N-terminal domain (NTD) of the SARS-CoV-2 spike (S) protein, located around binding sites for neutralizing antibodies.
COVID-19 vaccines use the S protein sequence from the ancestral Wuhan strain, with two proline substitutions (positions 986-987). Thus, vaccine-derived host immunity (i.e., antibody and T cell responses against the ancestral S protein sequence) is effective against most VOCs but is substantially reduced against the Omicron VOC, as vaccination exerts considerable evolutionary pressure on SARS-CoV-2. Similarly, natural immunity (due to prior infection) exerts an evolutionary pressure by conferring robust and durable protection against breakthrough infections.
Overall, the study highlighted that Distinctiveness is an important metric that holistically captures the ongoing combat between viral evolution and host immunity. More importantly, SARS-CoV-2 lineages most distinctive from both the ancestral strain and VOCs circulating widely or at high prevalence in a given geographic region were unlikely to be neutralized by host immune responses. They also had a high potential to drive future surges, thus indicating that Distinctiveness contributes to the overall competitive fitness of emerging SARS-CoV-2 variants.
To conclude, monitoring the Distinctiveness of SARS-CoV-2 sequences could aid global pandemic preparedness efforts.
*medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.