During the second coronavirus disease 2019 (COVID-19) wave in Brazil, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) P.1 (Gamma) lineage accounted for most of the genomes sequenced. The Gamma variant is considered one of the most relevant variants of concern (VOCs) globally. The spike protein of the Gamma variant carries ten non-synonymous mutations, three of which (K417T, E484K, and N501Y) are situated in the receptor-binding domain (RBD).
Study: Comparative genomics and characterization of SARS-CoV-2 P.1 (Gamma) Variant of Concern (VOC) from Amazonas, Brazil. Image Credit: cipta studio/ Shutterstock
A higher case fatality rate has been associated with the Gamma variant, and this trait may be related to its genetic background. How the unique set of approximately 35 amino acid substitutions that characterizes this variant originated remains largely unknown. One possibility is recombination: the exchange of genome segments by “copy choice” recombination is well described among coronaviruses. Alternatively, the high seroprevalence of anti-SARS-CoV-2 antibodies among residents of northern Brazil may indicate that strong selective pressure gave rise to the new lineage.
In this study, a multi-national team of researchers describes the full-length SARS-CoV-2 genomes of 44 clinical samples from Amazonas, Brazil, sequenced and analyzed by their group, and compares them to previously described Gamma variant genomes from Brazil and worldwide. Phylogenomic analyses, tests for recombination, phylogenetic analyses of the spike and non-structural proteins from open reading frame (ORF) 1a, and tests for selective pressure acting on these sequences were performed to understand the evolutionary forces driving the emergence and evolution of the Gamma variant.
A preprint version of this study, which is yet to undergo peer review, is available on the medRxiv* server.
A total of 44 SARS-CoV-2 genomes, all of the P.1 lineage, were successfully sequenced from samples taken between February and March 2021 from patients in the cities of Manaus, Parintins, and Itacoatiara. Apart from the expected mutations within the Gamma variant, the authors also detected unusual and/or uncertain substitutions and deletions. After automated alignment and visual inspection, the authors identified in three sequences a six-nucleotide deletion that removes amino acid residues N188 and L189 from the spike protein. In two of these sequences, an R190S substitution was also detected.
However, the alignment data alone cannot rule out the alternative of an N188S substitution accompanied by deletion of the L189 and R190 residues, because in both cases the single remaining codon encodes a serine residue, which is the pattern commonly observed in the Global Initiative on Sharing All Influenza Data (GISAID) database. In 22 genome sequences available in the GISAID database from India, Belgium, Germany, Egypt, Suriname, Greece, and the USA, adjacent positions are deleted in addition to N188 and L189.
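The ambiguity between the two interpretations can be made concrete with a small sketch. The Python below is only an illustration, not the authors' method: residues N188, L189, and R190 come from the text above, while the flanking residues at positions 187 and 191 and the helper `apply_changes` are hypothetical placeholders introduced here.

```python
# Toy window of the spike protein around positions 187-191.
# Residues 188-190 (N, L, R) come from the article; the flanking residues
# at 187 and 191 are illustrative placeholders, not verified reference data.
REFERENCE = {187: "K", 188: "N", 189: "L", 190: "R", 191: "E"}


def apply_changes(reference, deletions=(), substitutions=None):
    """Return the residue string left after deleting and substituting positions."""
    substitutions = substitutions or {}
    residues = []
    for position in sorted(reference):
        if position in deletions:
            continue  # deleted residues leave no trace in the final sequence
        residues.append(substitutions.get(position, reference[position]))
    return "".join(residues)


# Scenario A: N188 and L189 deleted, R190 replaced by serine (R190S).
scenario_a = apply_changes(REFERENCE, deletions={188, 189}, substitutions={190: "S"})

# Scenario B: L189 and R190 deleted, N188 replaced by serine (N188S).
scenario_b = apply_changes(REFERENCE, deletions={189, 190}, substitutions={188: "S"})

# Both scenarios leave one serine between the flanking residues.
print(scenario_a, scenario_b, scenario_a == scenario_b)
```

Because both scenarios produce the same residue string, the observed sequence alone cannot show which of the three original positions the surviving serine corresponds to.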
Five mutations identified in this study (P209H, N188del, T1066A, A243/L244del, and A243/L244 double deletion) were observed for the first time in Brazilian P.1 genomes. Before February 2021, these mutations were more commonly associated with the B.1.351 variant. Other lineages, such as AY.4 and B.1.1.7, also carry the P209H and T1066A mutations, but in the data from this study, they occurred within the P.1 lineage only in the Amazonas genomes.
Phylogenetic analysis of 4,952 P.1 and B.1.1.28 genomes together with the 44 genomes collected in this study revealed a P.1 group with B.1.1.28 sequences in the basal branch. One additional B.1.1.28 sequence, which originated from Turkey, was found within the P.1 group. However, the presence of the L18F, P26S, N501Y, T20N, E484K, and H655Y substitutions, identified through genomic analysis, suggests that this sequence likely belongs to the P.1 lineage. Of note, this B.1.1.28 genome is one of only two in the GISAID database that carry the N440K mutation; the other is also from Turkey.
As of September 28, 2021, only P.1 and B.1.1.28 sequences had been selected from the GISAID database. However, six other genomes originating in Turkey were subsequently reassigned as B.1, corroborating the formation of a basal clade with 94.2% and 100% branch support in the SH-aLRT and ultrafast bootstrap tests, respectively.
The B.1.1.28 and P.1 genomes form a large monophyletic group with 96.7% and 99% statistical support in the SH-aLRT and ultrafast bootstrap tests, respectively, which is evidence of an ancestor-descendant relationship, with B.1.1.28 as the ancestor and P.1 as the descendant. The 44 genomes sequenced in this study all fell within the P.1 cluster, although in different subgroups along the tree.
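To put the support values reported above in context, the sketch below checks them against a commonly cited rule of thumb (found, for example, in the IQ-TREE documentation) that treats a clade as reliable when SH-aLRT is at least 80% and ultrafast bootstrap is at least 95%. These thresholds, the function name `is_well_supported`, and the clade labels are assumptions introduced for illustration; the article does not state which cutoffs or software the authors used.

```python
# Assumed rule-of-thumb cutoffs; the article itself does not state any thresholds.
SH_ALRT_MIN = 80.0   # percent
UFBOOT_MIN = 95.0    # percent


def is_well_supported(sh_alrt, ufboot):
    """Return True when both support values meet the assumed cutoffs."""
    return sh_alrt >= SH_ALRT_MIN and ufboot >= UFBOOT_MIN


# Support values as reported in the article: (SH-aLRT %, ultrafast bootstrap %).
clades = {
    "basal clade": (94.2, 100.0),
    "B.1.1.28 + P.1 monophyletic group": (96.7, 99.0),
}

for name, (sh_alrt, ufboot) in clades.items():
    verdict = "well supported" if is_well_supported(sh_alrt, ufboot) else "weakly supported"
    print(f"{name}: SH-aLRT={sh_alrt}%, UFBoot={ufboot}% -> {verdict}")
```

Under these assumed cutoffs, both clades described in the article clear the bar on both tests.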
Taken together, the results of the study suggest that selective pressure is the main driving force in the evolution of the P.1 lineage. This conclusion rests on the combination of diversification among P.1 sequences, the absence of evidence for recombination, the known phylogenetic consequences of some signature mutations, and the confirmation of positive selection acting on some sites.
medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.