An unprecedented amount of genomic sequence data is accumulating in real time during the COVID-19 pandemic. Over 1.2 million sequences of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been generated during the past 15 months, and the scientific community has learned a great deal from these sequences.
Of all the mutations sampled so far, only a few became prominent in the viral population. Many of these mutations have emerged recently and in multiple lineages. This is a textbook example of convergent evolution at the molecular level, one that arouses curiosity and motivates study of the adaptive advantage driving these events.
A recent report by a team of researchers, released as a preprint on the bioRxiv* server, focused on the extent of convergent evolution in the SARS-CoV-2 spike (S) protein. The report confirms that the top SARS-CoV-2 lineages of concern carry the most convergent spike protein mutations, which points to a fundamental adaptive advantage.
Analyzing the extent of convergent evolution in the SARS-CoV-2 spike (S) protein
The vast majority of spike protein sites under convergent evolution, 21 out of 25, are tightly clustered in 3 functional domains: the receptor-binding domain (RBD), the N-terminal domain, and the furin cleavage site. The study also shows that, among mutations in the spike protein receptor-binding motif, substitutions that boost ACE2 affinity are favored.
The researchers write:
“To monitor SARS-CoV-2 evolution, we briefly looked for convergent changes among all genes of the SARS-CoV-2 genome at NextStrain.”
While the mutation space analyzed in the spike protein included all amino acids reachable by single nucleotide changes (SNCs), substitutions that require 2 nucleotide changes, and epistatic mutations of multiple residues, have started to emerge only recently.
Vaccination efforts may shift the evolutionary pressure of SARS-CoV-2 towards immune-escape mutation at the expense of viral fitness
SARS-CoV-2, like other viruses, is under evolutionary pressure to increase its fitness in a new environment. However, global vaccination programs are expected to shift that pressure towards immune-escape mutations, even at the expense of viral fitness.
The findings of this study show that of the vast number of mutations detected in SARS-CoV-2 genomes, only a few rose to high frequencies. Interestingly, many of these mutations display convergent evolution, which indicates a strong adaptive advantage granted by the specific mutations.
Most of the affinity-boosting mutations reachable by SNCs are already abundant in the global genomic dataset. In contrast, among mutations that require 2 nucleotide changes, the study identified only a single one, Y505W, with relatively high representation and higher affinity than the wild type. Many mutations that require 2 nucleotide changes and bind ACE2 more tightly have not been sampled so far.
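To make the distinction between single and double nucleotide changes concrete, the short Python sketch below counts the minimum number of nucleotide changes needed to turn one codon into a codon for a different amino acid. It is only an illustration based on the standard genetic code, not the method used in the preprint; the function names are hypothetical, and the codon actually present at position 505 of the spike gene is not stated in the report, so both tyrosine codons are checked.

```python
from itertools import product

# Standard genetic code, DNA alphabet, bases in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def min_nucleotide_changes(codon: str, target_aa: str) -> int:
    """Fewest nucleotide changes turning `codon` into any codon for `target_aa`."""
    return min(
        hamming(codon, other)
        for other, aa in CODON_TABLE.items()
        if aa == target_aa
    )


def reachable_by_single_change(codon: str) -> set:
    """Amino acids (excluding stop and the original) one nucleotide change away."""
    original = CODON_TABLE[codon]
    return {
        aa
        for other, aa in CODON_TABLE.items()
        if hamming(codon, other) == 1 and aa not in ("*", original)
    }


if __name__ == "__main__":
    # Tyrosine (Y) is encoded by TAT and TAC; tryptophan (W) only by TGG.
    for tyr_codon in ("TAT", "TAC"):
        print(tyr_codon, "-> W needs", min_nucleotide_changes(tyr_codon, "W"), "changes")
        print(tyr_codon, "single-change neighbours:", sorted(reachable_by_single_change(tyr_codon)))
```

Under the standard genetic code, tryptophan is not among the single-change neighbours of either tyrosine codon, which is consistent with the report's description of Y505W as a substitution that requires 2 nucleotide changes.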
Convergent evolution of the spike protein can help produce more effective universal 2nd-generation vaccines to protect the global population
The mutations requiring 2 nucleotide changes that have been sampled so far show a sharp rise in number during the last 3 months, although their frequencies remain very low. This offers an explanation as to why epistatic mutations, which require coordinated changes in several nucleotides, are still a rarity.
The predictability of convergent evolution in the spike protein can improve the odds that the spike protein sequences chosen for universal 2nd-generation vaccines will effectively protect the worldwide population from current and future viral variants of concern.
Despite the physical association and convergent emergence of these adaptive mutations, they are not well understood. The researchers aim to promote research on currently circulating variants that are understudied and may become variants of concern in the future.
The team concludes:
“Our analysis shows that among the vast number of mutations which have been detected in SARS-CoV-2 viral genomes, only few rose to high frequencies.”
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.