Recombination is a common feature of sarbecovirus evolution, in which genetic material from two genetically distinct parental lineages is merged into a viable descendant virus genome. Genomic investigations show that recombination events occurred among coronaviruses circulating in non-human species during the evolutionary history of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). During the coronavirus disease 2019 (COVID-19) pandemic, statistical analyses of SARS-CoV-2 genomes have also shown signs of ongoing recombination.
Study: Emergence and widespread circulation of a recombinant SARS-CoV-2 lineage in North America. Image Credit: Lightspring/Shutterstock
Recombinant SARS-CoV-2 genomes have been found in the UK at low frequency, with some showing evidence of onward transmission. One of these UK recombinants was designated lineage XA, the first recombinant lineage in the Pango nomenclature system.
As SARS-CoV-2 spreads worldwide, new lineages emerge and are tracked using the Pango dynamic hierarchical nomenclature system. A succession of lineages descended from B.1 was first found in North and Central America in late 2020 and early 2021.
National genomic surveillance programs in the United States of America, Mexico, and other nations in the Americas identified lineages B.1.627, B.1.628, B.1.631, and B.1.634. Their genomes were posted publicly on the GISAID database. The presence of genomic similarities between these and other lineages circulating in the region and elsewhere led to the hypothesis that recombination occurred during their emergence and spread.
In this paper, a multinational team of researchers examines the dissemination and evolution of these four lineages, as well as the possibility that one or more recombination events played a role in their evolution.
The findings support a recombinant origin for lineage B.1.628 and its classification as a separate recombinant lineage with onward transmission, circulating in several nations.
A preprint version of this study, which is yet to undergo peer review, is available on the medRxiv* server.
A total of 1950 sequences from lineages B.1.627 (n = 252), B.1.628 (n = 1391), B.1.631 (n = 181), and B.1.634 (n = 126) were analysed for their spatiotemporal distribution. The four lineages' sequences were gathered between July 8, 2020, and August 18, 2021, with most of the samples taken in 2021.
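The per-lineage counts above can be tallied to confirm the stated total and to see how unevenly the sample is distributed. The following is a minimal sketch that uses only the four counts quoted in the text; the variable names are illustrative and are not taken from the study.

```python
# Sequence counts per lineage, as quoted in the article.
counts = {
    "B.1.627": 252,
    "B.1.628": 1391,
    "B.1.631": 181,
    "B.1.634": 126,
}

# Total number of sequences across the four lineages.
total = sum(counts.values())

# Share of the data set contributed by each lineage, in percent.
shares = {lineage: 100 * n / total for lineage, n in counts.items()}

print(f"total sequences: {total}")
for lineage, share in shares.items():
    print(f"{lineage}: {share:.1f}% of sequences")
```

Under these counts, B.1.628 contributes roughly seven in ten of the sequences analysed.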
All four lineages were mostly sampled in North America (89.5%), either in the USA or in Mexico. B.1.627 and B.1.631 were generally found in the USA, while B.1.634 was more frequent in Mexico. According to the findings, recombination is likely to have occurred among the sequences in this data set.
According to the genetic algorithm for recombination detection (GARD) analysis, the results can be explained by a single breakpoint in the alignment, and a model that includes this recombination event is strongly supported. GARD infers that the breakpoint lies around position 21308 (a TTT codon), which corresponds to the signal peptide region at the spike protein's N-terminus (18 nucleotides downstream of the canonical sarbecovirus transcription regulatory sequence AACGAAC).
However, the results varied somewhat with the subsampling approach; an analysis that excludes the B.1.634 lineage infers a recombination breakpoint at position 22775-22778 (at a GAT codon) in the Spike protein reading frame, which is close to beta-sheet 3 and corresponds to amino acid 390D in the core region of the receptor-binding domain (RBD). The recombination analysis places the NSP6 deletions on one side of the breakpoint and the Orf3a deletions on the other.
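The intuition behind a single-breakpoint model is that a recombinant genome resembles one parent on one side of the breakpoint and the other parent on the opposite side. The sketch below illustrates that idea on toy aligned sequences; it is not GARD and not the authors' method, and the function names, sequences, and breakpoint position are all hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of aligned sites at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_parent_by_side(child: str, parent_a: str, parent_b: str,
                           breakpoint_pos: int) -> dict:
    """Report which parent the child matches best on each side of a breakpoint."""
    sides = {
        "left": slice(0, breakpoint_pos),
        "right": slice(breakpoint_pos, None),
    }
    result = {}
    for name, region in sides.items():
        id_a = identity(child[region], parent_a[region])
        id_b = identity(child[region], parent_b[region])
        if id_a > id_b:
            result[name] = "A"
        elif id_b > id_a:
            result[name] = "B"
        else:
            result[name] = "tie"
    return result


# Toy alignment: the child copies parent A up to site 5, then parent B.
parent_a = "AAAAAAAAAAAA"
parent_b = "CCCCCCCCCCCC"
child = parent_a[:5] + parent_b[5:]

closest = closest_parent_by_side(child, parent_a, parent_b, breakpoint_pos=5)
print(closest)
```

Real recombination detection also has to search over candidate breakpoints and compare models statistically, which this sketch deliberately leaves out.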
The authors calculated pairwise genetic distances across the genomes of representative sequences (basal to the main clades) in reference to the Hu-1 reference genome to further investigate the genetic divergence of these lineages and the sequences near the root of the phylogenetic trees (specifically, B.1.631 minor and B.1.628 minor).
While mutations have accumulated in all lineages, the Orf1ab region of B.1.631 minor shows the least divergence from Hu-1. All clades show peak genetic divergence between positions 21000 and 23000, except for B.1.628 major, which diverges from Hu-1 uniformly across its genome.
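Divergence from a reference along the genome, as described above, can be pictured as a distance computed in windows across an alignment. The following is a minimal sketch assuming a simple proportion-of-differences distance (p-distance) on toy sequences; the article does not state which distance measure or window size the authors used, so those choices and all names here are assumptions.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping sites with a gap or N in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0


def windowed_distance(ref: str, query: str, window: int, step: int) -> list:
    """Return (window_start, p_distance) pairs along an aligned reference/query pair."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    last_start = max(len(ref) - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        profile.append((start, p_distance(ref[start:end], query[start:end])))
    return profile


# Toy example: the query differs from the reference only at sites 20-23.
ref = "A" * 40
query = "A" * 20 + "C" * 4 + "A" * 16

profile = windowed_distance(ref, query, window=10, step=10)
for start, dist in profile:
    print(start, dist)
```

A profile with one sharp peak would correspond to divergence concentrated in one genomic region, whereas a flat profile would correspond to divergence spread evenly across the genome.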
The authors discovered that B.1.628 major arose as a result of a recombination event between the B.1.631 major clade and lineage B.1.634, prompting its classification as a recombinant lineage under the Pango nomenclature convention. It is also proposed that the sequences classified as B.1.628 minor be examined and categorized as lineage B.1.
The dominance of lineage B.1.617.2 (VOC Delta) on a global scale appears unchallenged at the time of writing, but the remarkable expansion of B.1.628 in early and mid-2021 highlights the viability of a recombinant SARS-CoV-2 lineage and delineates yet another important function of active genomic surveillance programs.
These findings highlight the necessity of future research into the virus's recombination rate and potential and the drivers of such evolutionary processes.
medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.