The Phylogenetic Tree of the SARS-CoV-2 Virus

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a betacoronavirus that is thought to have originated in bats before jumping to humans, possibly via an intermediate species, and causing the COVID-19 pandemic.

Some of the mutations that have occurred to date have increased transmissibility, and one variant shows possible evidence of being more deadly. The virus was always expected to evolve into different strains, much as influenza does, and this needs to be taken into account when planning treatment and vaccination.

Image: bat. Image Credit: Rudmer Zwerver

What is a phylogenetic tree?

Phylogenetic trees are diagrammatic representations of the evolutionary relationships between related species, based on their genetic and physical similarities. At the broadest level, a phylogenetic tree starts from a ‘common ancestor’ that gave rise to the multitude of life, including bacteria, archaea, and eukaryotes.

Viruses are biological species, in the sense that they comprise nucleic acid sequences and are subject to a wide array of evolutionary changes including mutations to their sequences. Whilst viruses are not living organisms, the ability of viral nucleic acid sequences to accumulate changes through mutations or recombination with other species gives rise to novel viral lineages.
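To make the idea of grouping by genetic similarity concrete, the following Python sketch builds a small tree from four made-up aligned sequences. It counts the mismatches between each pair and repeatedly joins the two closest groups using average linkage (UPGMA). This simple textbook method is assumed here purely for illustration; published SARS-CoV-2 trees are generally built with more sophisticated statistical methods. The sequence names and letters are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def upgma(seqs):
    """Join the closest groups step by step and return a Newick-style string."""
    # Each cluster label maps to the list of original sequence names it contains.
    clusters = {name: [name] for name in seqs}

    def dist(c1, c2):
        # Average linkage: mean mismatch count over all cross-cluster pairs.
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        return sum(hamming(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Sorting the labels makes tie-breaking deterministic.
        c1, c2 = min(combinations(sorted(clusters), 2), key=lambda p: dist(*p))
        clusters[f"({c1},{c2})"] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters)) + ";"


# Hypothetical toy alignment, not real viral data.
toy = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACTA",
    "D": "TCGAACTA",
}
print(upgma(toy))  # prints ((A,B),(C,D));
```

The output is in Newick notation, in which nested brackets denote shared ancestry: here A groups with B, and C groups with D. Branch lengths are left out to keep the sketch short.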

SARS-CoV-2 phylogenetic tree

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing global pandemic of coronavirus disease 2019 (COVID-19), which originated around mid-December 2019 in Wuhan, in the Hubei Province of China. After spreading across the globe, COVID-19 was declared a pandemic by the World Health Organisation.

SARS-CoV-2 is a lineage B betacoronavirus belonging to the Coronaviridae family. This family belongs to the order Nidovirales, of the class Pisoniviricetes, of the phylum Pisuviricota, of the kingdom Orthornavirae, of the realm Riboviria. As such, the virus has an RNA genome (a single linear segment of positive-sense single-stranded RNA, +ssRNA) and an RNA-dependent RNA polymerase (RdRp), which produces RNA from an RNA template.

Lineage B betacoronaviruses also include SARS-CoV, the virus that causes SARS, and both viruses bind to the ACE2 receptor. However, unlike SARS-CoV, SARS-CoV-2 contains an evolutionarily distinct and proteolytically sensitive activation loop (a furin-like cleavage site) that is thought to be the reason behind its increased pathogenicity and transmissibility. Furin cleavage at this site has been shown to promote infectivity and cell-cell fusion.

SARS-CoV-2 contains the furin recognition motif (PRRA) in its S1/S2 cleavage site. This site is not present in the currently recognized closest relative of SARS-CoV-2, the RaTG13 SARS-like bat coronavirus, but is similar to a site identified in pangolin coronaviruses, suggesting that the furin cleavage site could have arisen through a recombination event. Research has shown that furin cleavage sites have arisen naturally many times in coronavirus evolution.
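As a rough illustration of how such a motif can be located in a protein sequence, the Python sketch below scans a short peptide for PRRA and, as an added assumption not taken from this article, for the minimal furin recognition pattern R-X-X-R (arginine, any two residues, arginine). The two peptide fragments are illustrative stand-ins rather than authoritative spike sequences, and the function name is hypothetical.

```python
import re


def find_motif(protein, pattern):
    """Return the 1-based start positions of every match, overlaps included."""
    # The lookahead lets overlapping matches be reported individually.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", protein)]


# Illustrative fragments only: one with a PRRA insert, one without.
with_insert = "QTNSPRRARSVAS"
without_insert = "QTNSRSVAS"

print(find_motif(with_insert, "PRRA"))     # prints [5]
print(find_motif(with_insert, "R..R"))     # prints [6]
print(find_motif(without_insert, "PRRA"))  # prints []
```

Comparing related sequences in this way shows whether a motif is present in one lineage and absent from another, which is the kind of observation behind the recombination hypothesis above.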

SARS-CoV-2 is considered to have originated in bats because of its close genetic similarity to bat coronaviruses (96%). There is no concrete evidence that another host acted as a reservoir for the virus before transmission to humans, although the virus shares up to 92% similarity with pangolin coronaviruses.

Some evidence suggests that a bat-borne ancestor of SARS-CoV-2 may have jumped to pangolins, then back to bats (incorporating some pangolin homology), and then to humans.
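Similarity figures of this kind are typically percent identities: the share of aligned positions at which two sequences carry the same letter. The Python sketch below shows one simple way to compute this, assuming the sequences are already aligned and that any column containing a gap ('-') is skipped. Other conventions exist, and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of gap-free aligned columns in which the two sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only the columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: one mismatch in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # prints 90.0
```

At genome scale, even a few percent of difference amounts to a large number of individual changes, so a high percent identity does not by itself imply a very recent common ancestor.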

Image: SARS-CoV-2 virus. Image Credit: Kateryna Kon

Strain divergence of SARS-CoV-2

Analysis of SARS-CoV-2 across different nations and at different times during the pandemic has revealed that the virus has undergone several mutations. Some of these do not seem to have any significant impact on the virus, whereas others are thought to be more significant and point to strain divergence.

Mutations can give the virus an advantage and make it more virulent, or they can leave it weaker than its current form.

It was inevitable that, as time progressed, the virus would accumulate independent mutations in different locations.

The specific mutations that have occurred in SARS-CoV-2 are mostly neutral, although some are allowing the virus to adapt further to the human host. One of the strongest early divergences occurred at site 11083 of Orf1a, which encodes Nsp6. This site is thought to lie in a region that elicits CD4+ and CD8+ T-cell responses, so changes within this region may account for differences in immune responses to SARS-CoV-2.

Due to the zoonotic nature of SARS-CoV-2, it is next to impossible to predict the trajectory of the future phylogenetic diversity of the virus, and how it may adapt and evolve to infect humans in different ways.

As with influenza viruses, which have several divergent strains, SARS-CoV-2 has diverged into multiple strains with differing levels of virulence and transmissibility. This is a concern for vaccine development: current vaccines are being tested against multiple strains, and developers are preparing to adapt them to new strains.

A D614G mutation that arose early in the pandemic (January/February 2020) altered the spike protein. The more transmissible strain carrying it quickly became dominant, replacing the original strain and spreading across the world.

Over the course of the pandemic, thousands of SARS-CoV-2 variants have developed, and these are grouped into larger viral clades. Different nomenclatures have been proposed for these clades, for example the seven GISAID clades: L, O, V, S, G, GH, and GR.
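Labels such as D614G follow a standard shorthand: the original amino acid (D, aspartate), its position in the protein (614), and the replacement (G, glycine). The Python sketch below parses such a label and applies it to a sequence. The helper names and the four-residue toy peptide are hypothetical, and only simple single-residue substitutions are handled.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str  # original amino acid (one-letter code)
    pos: int  # 1-based position in the protein
    alt: str  # replacement amino acid


def parse_substitution(label):
    """Split a label such as 'D614G' into its three parts."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if m is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))


def apply_substitution(protein, sub):
    """Return a copy of the protein with the substitution applied."""
    if not 1 <= sub.pos <= len(protein):
        raise ValueError("position lies outside the sequence")
    if protein[sub.pos - 1] != sub.ref:
        raise ValueError("reference residue does not match the sequence")
    return protein[: sub.pos - 1] + sub.alt + protein[sub.pos :]


print(parse_substitution("D614G"))
# Toy peptide, not the real spike protein: D at position 3 becomes G.
print(apply_substitution("MKDL", parse_substitution("D3G")))  # prints MKGL
```

Checking the reference residue before substituting guards against applying a label to the wrong protein or to a sequence numbered differently.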

Current strains of concern

The following lineages of SARS-CoV-2 are causing concern as of March 2021 because they show increased transmissibility and have spread rapidly around the world.


The B.1.1.7 strain was first identified in Britain and has since spread to over 90 countries. The strain shows increased transmissibility due to mutations in its spike protein. There is also some evidence that the strain is more deadly. However, the strain is not thought to affect the efficacy of the current vaccines.


The B.1.351 strain was first detected in South Africa in October 2020 and has since spread to more than 48 countries. This strain also has mutations in its spike protein that increase transmissibility. Research is still underway, but early evidence suggests that antibodies raised against other strains, and some vaccines, are less effective against this variant.


The P.1 strain, which originated in Brazil, is another strain in which spike protein mutations have led to increased transmissibility. The strain is not thought to be more deadly, and more research is needed into the effects of this variant on vaccine efficacy.



In summary, SARS-CoV-2 shares high homology with bat coronaviruses and as such is thought to be bat-borne. The role of an intermediate host reservoir, thought to be pangolins, is still to be confirmed.

Numerous mutations have occurred within SARS-CoV-2, giving rise to distinct geographical variants, and a few have given the virus an advantage, increasing transmissibility and possibly virulence.




Last Updated: Mar 22, 2021

Written by Dr. Osman Shabir

Osman is a Postdoctoral Research Associate at the University of Sheffield studying the impact of cardiovascular disease (atherosclerosis) on neurovascular function in vascular dementia and Alzheimer's disease using pre-clinical models and neuroimaging techniques. He is based in the Department of Infection, Immunity & Cardiovascular Disease in the Faculty of Medicine at Sheffield.




The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of News Medical.