A detailed study published on the preprint server bioRxiv* in July 2020 argues that recombination accounts for the ability of both SARS-CoV and SARS-CoV-2 to use human angiotensin-converting enzyme 2 (hACE2) as the binding receptor mediating viral infection of human cells. This helps explain how this common trait is present in two otherwise diverse viruses. It also emphasizes the need to watch for new outbreaks of human pathogenic sarbecoviruses in south-western China and Africa, among other regions.
SARS-CoV and SARS-CoV-2 Are Quite Different
The current COVID-19 pandemic is caused by the viral agent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Despite the similarity of its name to that of the earlier SARS pathogen, SARS-CoV, the two are not in the same line of descent. Scientists therefore regard the subgenus Sarbecovirus, the group of coronaviruses (CoVs) that includes both these viruses as well as many bat viruses and a few pangolin viruses, as a potential source of many new pathogens.
Both SARS viruses infect cells by binding to the ACE2 molecule, so this trait is central to understanding how sarbecoviruses became able to infect humans. Some bat SARS-like CoVs are phylogenetically close to SARS-CoV but cannot bind human ACE2. Conversely, another bat SARS-like CoV can.
ACE2 Binding Ability
The ability to bind human ACE2 has been traced to the presence of a couple of amino acids in the receptor-binding domain (RBD) of the virus. Deletion of these residues produces significant structural differences that abolish this ability. The question facing the researchers in the current study is: since these viruses are all closely related, how did some of them acquire this ability?
The current hypothesis is that Chinese horseshoe bats, which have been shown to harbor many different sarbecoviruses, are the primary natural reservoir for this subgenus. They are also considered the primary host for the ancestor of SARS-CoV, as a number of very similar viruses have been isolated from them. Some scientists think that SARS-CoV acquired recombinant genomic regions from different SARS-like CoVs in and near Yunnan province in China, and then leaped into humans, crossing species barriers.
Horseshoe Bat (Rhinolophus sp.) Image Credit: Hugh Lansdown / Shutterstock
The Spike Gene
One particular newly acquired region in the SARS-CoV is the spike gene, as shown by the presence of a breakpoint at the point where the ORF1b meets the spike. This spike has a markedly different genetic makeup compared to the spike proteins of other viruses in the same clade. The primary variation is in the large deletions in the RBD mentioned above. The recombinant region is considered to have been acquired from an unknown line of sarbecoviruses and made it possible for the SARS-CoV to emerge as a human pathogen. This event has been demonstrated to occur in the spike genes of other CoVs, which have then caused infection in humans and domestic animals, and recombination is a vital process in driving the emergence of all CoVs.
Structural modeling of sarbecovirus RBDs found in Uganda and Rwanda. (A) Structural superposition of the X-ray structures for the RBDs in SARS-CoV-1 (PDB 2ajf, red) and SARS-CoV-2 (PDB 6m0j, cyan) and homology models for SARS-CoV found in Uganda (PREDICT_PDF-2370 and PREDICT_PDF-2386, purple) and Rwanda (PREDICT_PRD-0038, yellow). (B) Overview of the X-ray structure of SARS-CoV-1 RBD (red) bound to hACE2 (blue) (PDB 2ajf). (C) Close-up view of the interface between hACE2 (blue) and RBDs in SARS-CoV-1 (PDB 2ajf, red) and SARS-CoV-2 (PDB 6m0j, cyan) and homology models for viruses found in Uganda (PREDICT_PDF-2370 and PREDICT_PDF-2386, purple) and Rwanda (PREDICT_PRD-0038, yellow). Labeled RBD residues correspond to residues whose identity is not shared by SARS-CoV-1 and/or SARS-CoV-2 (asterisks denote residues whose identity is not shared by any ACE2-binding SARS-CoV as dictated by Figure 3). Labeled hACE2 residues correspond to residues within 5 Å of the RBD residues depicted.
Confined to One Habitat Range
The prerequisite for CoV recombination is that the different viruses have overlapping habitats, host species, and tissue tropism. Bat sarbecoviruses form three phylogenetic clusters by region: one lineage from south-western China (including SARS-CoV), one from other southern Chinese provinces, and one from central and northern regions. The same picture prevails in Africa and Europe. Although the bats are widely distributed, the sarbecoviruses are for the most part confined to one geographic area, though they can and do infect co-occurring bat species easily.
The Fourth Lineage
SARS-CoV-2 shares high genome-level identity with a bat CoV, RaTG13, from Yunnan, and both bind human ACE2, with higher and lower affinity, respectively. Since this discovery, another seven full-length viral genomes with high homology have been published, from Malayan pangolins. The closest match is with yet another bat CoV, RmYN02, also from Yunnan.
These viruses, therefore, are part of a fourth cluster, but unlike the other three, they come from different geographic regions. This could be a false appearance, since the animal hosts may have been transported to these locations from their original region.
Recombination Produced ACE2-Binding Trait
The current study reports that SARS-CoV and closely related viruses may have gained their ACE2-binding ability through recombination between the ancestor of SARS-CoV and a virus from the fourth cluster. This could have happened only because they shared the same geographic habitat.
Deletions appear to have occurred later that caused other closely related viruses to lose this ability. Again, the researchers show that three CoVs from Uganda and Rwanda have RBDs that occupy a middle ground between hACE2-using viruses and those that do not. These are 99% to almost 100% identical to each other. They are 76% and 74% similar to SARS-CoV and SARS-CoV-2, respectively.
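Percentages like these come from comparing aligned sequences position by position. The article does not say exactly how the study computed them, so the following is only a minimal sketch of one common convention (identity over columns where neither sequence has a gap). The function name and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same alignment, so they have
    equal length. Other conventions (e.g. counting gaps as mismatches)
    give different numbers.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments, for illustration only.
toy_a = "NLCPFGEVFN"
toy_b = "NLCPFHEVFN"  # one substitution relative to toy_a
toy_c = "NLCP--EVFN"  # two-residue deletion relative to toy_a

print(percent_identity(toy_a, toy_b))
print(percent_identity(toy_a, toy_c))
```

Under this convention a deletion does not lower the identity score, which is one reason identity figures alone cannot show whether a virus has kept the residues needed for ACE2 binding.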
All three are within the same clade, intermediate between the two SARS viruses, and may be a cryptic species. They were all identified from the same host species.
A phylogenetic tree constructed from the RNA-dependent RNA polymerase (RdRp) gene shows that these viruses follow the earlier principle of geographic phylogeny, clustering with other viruses from the same geographic habitat. The European and African species cluster phylogenetically with each other. Meanwhile, SARS-CoV-2 belongs to another clade containing viruses from Africa and Eastern Europe.
A second tree was built based on the RBD alone. This showed a closer relationship between all viruses using ACE2 with SARS-CoV-2 than with other viruses in the first cluster. However, the bat CoV RmYN02 moves to another clade with viruses that do not use ACE2. Within the clade of RBD similarity, viruses that do and do not use ACE2 are completely separated, with the African SARS-like viruses again occupying an intermediate but distinct clade, nearer the former clade.
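The contrast between the RdRp tree and the RBD tree is the kind of signal that points to recombination: a virus whose closest relative changes from one genomic region to another may have inherited those regions from different ancestors. The sketch below is a deliberately simplified stand-in for that idea, using per-region identity instead of real phylogenetic trees. It is not the study's method, and all names and sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def closest_relative(query: str, alignment: dict, start: int, end: int) -> str:
    """Name of the sequence most identical to `query` in columns [start, end)."""
    q = alignment[query][start:end]
    scores = {
        name: identity(q, seq[start:end])
        for name, seq in alignment.items()
        if name != query
    }
    return max(scores, key=scores.get)


def regions_disagree(query: str, alignment: dict, region_a, region_b) -> bool:
    """True if the closest relative differs between the two regions."""
    return (closest_relative(query, alignment, *region_a)
            != closest_relative(query, alignment, *region_b))


# Hypothetical toy alignment: columns 0-9 stand in for a backbone gene,
# columns 10-19 for an RBD-like region.
toy_alignment = {
    "lineage_A": "AAAAAAAAAA" + "CCCCCCCCCC",
    "lineage_B": "GGGGGGGGGG" + "TTTTTTTTTT",
    "virus_X":   "AAAAAAAAAG" + "TTTTTTTTTC",
}

print(closest_relative("virus_X", toy_alignment, 0, 10))
print(closest_relative("virus_X", toy_alignment, 10, 20))
print(regions_disagree("virus_X", toy_alignment, (0, 10), (10, 20)))
```

In the toy data, virus_X resembles lineage_A in the backbone but lineage_B in the RBD-like region, which mirrors the pattern described for ACE2-using cluster 1 viruses. Real analyses rely on proper tree inference and statistical tests rather than raw identity.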
Mechanisms of Loss of ACE2 Binding
In vitro assays of the ACE2 binding activity of the African viruses show that they cannot use ACE2. Despite their structural similarity with both SARS viruses, they have a deletion of 2-3 amino acids in region 2, which is close to the ACE2 binding region. One of the deleted amino acids forms hydrogen bonds with ACE2 residues in both the SARS viruses. These deletions are shared by all other viruses that do not use ACE2.
Another feature, found in all African sarbecoviruses, is a lysine residue at a position within region 3 that makes contact with the hotspots in the ACE2 molecule and is essential for binding. The SARS viruses have other residues at this position, whereas the lysine reduces the binding affinity more than 20-fold through unfavorable electrostatic interactions with the ACE2 hotspots. Most non-ACE2 binders have a valine in position 487, which is hydrophobic and could change the protein arrangement at one of the ACE2 hotspots, preventing binding to the RBD.
Thirdly, the receptor binding ridge is not found in any non-ACE2 using virus but is present in the African sarbecoviruses. However, it has differences in the amino acids that set it apart from the SARS viruses. Structural differences in the ridge increase the binding affinity of SARS-CoV-2 over SARS-CoV, by compacting the loop. Similarly, changes in the amino acids forming the ridge could be responsible for hindering ACE2 binding by the African viruses. Thus, only those sarbecoviruses within the SARS-CoV and SARS-CoV-2 RBD clade can use human ACE2 for binding.
Stepwise Deletions Followed by Recombination
The deletions in region 2 and the receptor binding ridge seem to have arisen in two steps rather than together. The researchers suggest that the region 2 deletions came first, before the African and European clades emerged. The ridge deletion came next, before clusters 1, 2 and 3 emerged, and is not seen in the African or European clades.
However, SARS-CoV and the other ACE2-using viruses in cluster 1 sit within a clade of viruses that do not use ACE2 and that carry both deletions, so it appears that the deleted residues would have had to be reinserted to allow ACE2 usage again.
Selection pressure analysis also strongly supports the hypothesis that many residues in the spike protein, especially those at the interface of the RBD, where it engages the ACE2 receptor, are being positively selected. This includes all five amino acids that confer high binding affinity to SARS-CoV-2 relative to SARS-CoV. On the other hand, the non-ACE2 using viruses show purifying selection instead.
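Selection analyses of this kind rest on the distinction between nucleotide changes that alter the encoded amino acid (nonsynonymous) and those that do not (synonymous). An excess of nonsynonymous change suggests positive selection, while a deficit suggests purifying selection. The sketch below only classifies differing codons between two codon-aligned coding sequences using the standard genetic code. It yields raw counts, not a normalized dN/dS estimate, and it is not the method used in the study. The function name and sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def count_codon_changes(cds_a: str, cds_b: str) -> tuple:
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Assumption: both sequences are in frame and codon-aligned. Codons with
    gaps or ambiguous bases are skipped.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and equal in length")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy sequences: M K V L versus M R V L.
toy_cds_a = "ATGAAAGTTCTG"
toy_cds_b = "ATGAGAGTCCTG"
print(count_codon_changes(toy_cds_a, toy_cds_b))
```

Here AAA to AGA changes lysine to arginine (nonsynonymous), while GTT to GTC leaves valine unchanged (synonymous). Published selection analyses go further by normalizing for the number of possible synonymous and nonsynonymous sites and by fitting codon models along a phylogeny.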
The study suggests that the most efficient way to achieve this difference in position in the RBD tree is via a recombination event, in which a recent ancestor of the cluster 1 viruses that use ACE2 recombined with a virus in the lineage of SARS-CoV-2. It is possible that two separate events occurred, since there are two related groups of RBD sequences. It could also be that recombination occurred so long ago that the two groups have since diverged noticeably.
Recombination at the Level of Ancestral Viruses in Both Lineages
Many previous studies show that while SARS-CoV is recombinant, SARS-CoV-2 is not. However, ancestral viruses in both lineages could have shared the same space, as is seen today. Thus, the study shows for the first time the recombination of an ancestor of the former with a previously unknown ancestor of the latter, confirming that recombination is an important force allowing the sarbecoviruses to spill over into species other than the original host.
Implications and Future Research
The researchers also suggest that this change has led to the ability of two groups of viruses to use different receptors in the same hosts so that they can share the same location and the same host species. This means that antibodies against one will not cross-react or protect against the other one. This competitive release, if present, could imply a much greater diversity and geographical spread for the SARS-CoV-2 than is thought to be the case at present.
The researchers say, “Together, these findings help illuminate the evolutionary history of ACE2 usage within sarbecoviruses and provide insight into identifying their risk of emergence in the future. We also propose a mechanism that could explain the pattern of phylogeography across lineages 1, 2, and 3, and why lineage 4 viruses (including SARS-CoV-2 and its relatives) do not adhere to this pattern.”
*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.