A team of scientists at the Defence Research and Development Establishment, India, reports that multiple viral introduction events occurred in central India and that SARS-CoV-2 carrying the D614G mutation is the predominant circulating strain.
Since the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China in late December 2019, many epidemiological studies have investigated the incidence and spread of the virus. Because SARS-CoV-2, the causative pathogen of the coronavirus disease 2019 (COVID-19) pandemic, is highly transmissible, the number of infected people and the death toll are increasing exponentially worldwide. Extensive sequencing of the viral genome is an essential step toward understanding viral evolution and transmission dynamics, containing the viral spread, and identifying therapeutic interventions.
Colorized scanning electron micrograph of a cell showing morphological signs of apoptosis, infected with SARS-CoV-2 virus particles (orange), isolated from a patient sample. Image captured at the NIAID Integrated Research Facility (IRF) in Fort Detrick, Maryland. Credit: NIAID/NIH
Current study design
In the current study, published on the preprint server bioRxiv*, the scientists studied 5,000 suspected COVID-19 cases to investigate virus introduction events and viral spread in central India. The patient samples, along with detailed patient histories, were collected from 10 districts. To confirm COVID-19 cases, they performed reverse transcription quantitative polymerase chain reaction (RT-qPCR), which identified 136 SARS-CoV-2-positive cases.
Of the 136 cases, 26 were selected for whole-genome sequencing based on the patient's travel history, age, and contact history; links to entry into the study region from outside; and status as an index case (first identified case) or a fatal case. Whole-genome sequencing on the Oxford Nanopore platform yielded consensus genome sequences of representative SARS-CoV-2 strains circulating in the 10 districts of central India.
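As a quick arithmetic check on the figures reported above, the short sketch below computes the test positivity among suspected cases and the share of positive cases that were sequenced. The variable names are illustrative and do not come from the preprint.

```python
# Figures as reported in the article; variable names are illustrative.
suspected_cases = 5000
positive_cases = 136
sequenced_cases = 26

# Share of suspected cases that tested positive by RT-qPCR.
positivity_pct = 100 * positive_cases / suspected_cases

# Share of positive cases selected for whole-genome sequencing.
sequenced_pct = 100 * sequenced_cases / positive_cases

print(f"Positivity: {positivity_pct:.2f}% of suspected cases")
print(f"Sequenced:  {sequenced_pct:.1f}% of positive cases")
```

With these inputs, roughly 2.7% of suspected cases were positive, and about one in five positive cases was sequenced.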
Understanding the emergence and expansion of SARS-CoV-2 in central India
Analysis of patient characteristics revealed that the number of infected cases was highest in the 21 to 30 years age group, and the most common symptoms were fever, breathlessness, and sore throat. Among immunocompromised patients, death occurred within 24 to 48 hours of hospitalization.
By comparing the whole-genome sequences of the study's SARS-CoV-2 strains with those of globally circulating strains, the scientists identified 38 amino acid substitutions relative to the Wuhan strain. The majority (n=24) were in the ORF1ab protein, followed by the spike protein (n=5), nucleocapsid (n=4), ORF3a (n=2), and the envelope protein, membrane protein, and ORF7a (n=1 each).
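The per-protein counts above can be tallied to confirm that they add up to the reported total of 38. The following is a minimal sketch using only the numbers given in the article; the dictionary name and layout are illustrative.

```python
# Amino acid substitution counts per protein, as reported in the article.
substitutions_by_protein = {
    "ORF1ab": 24,
    "spike": 5,
    "nucleocapsid": 4,
    "ORF3a": 2,
    "envelope": 1,
    "membrane": 1,
    "ORF7a": 1,
}

total = sum(substitutions_by_protein.values())

# Print each protein with its count and its share of all substitutions.
for protein, count in substitutions_by_protein.items():
    share = 100 * count / total
    print(f"{protein:<13}{count:>3}  {share:5.1f}%")

print(f"{'total':<13}{total:>3}")
```

The counts sum to 38, and ORF1ab alone accounts for about 63% of the substitutions.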
Given the importance of the spike protein in SARS-CoV-2 transmission, the scientists analyzed the spike substitutions in detail. They found that 17 viral strains from 4 districts carried the D614G mutation, which is reported to make the virus more infectious. The other spike substitutions identified in the study strains were E583D, S884F, S929T, and S943P.
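Labels such as D614G follow the usual shorthand for amino acid substitutions: the original residue, its position in the protein, and the replacement residue, each residue written as a one-letter code. The sketch below parses that shorthand for the five spike substitutions named above. The function names `parse_substitution` and `describe` are hypothetical helpers written for this illustration, not part of the study's pipeline.

```python
import re

# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def parse_substitution(label):
    """Split a label like 'D614G' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    if original not in THREE_LETTER or replacement not in THREE_LETTER:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return original, int(position), replacement


def describe(label):
    """Rewrite a one-letter label in three-letter form, e.g. 'Asp614Gly'."""
    original, position, replacement = parse_substitution(label)
    return f"{THREE_LETTER[original]}{position}{THREE_LETTER[replacement]}"


# Spike substitutions named in the article.
spike_substitutions = ["D614G", "E583D", "S884F", "S929T", "S943P"]
for label in spike_substitutions:
    print(label, "->", describe(label))
```

Read this way, D614G means that aspartic acid (D) at position 614 of the spike protein is replaced by glycine (G).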
Notably, the scientists also identified nonsynonymous substitutions outside the spike protein: A97V in the RNA-dependent RNA polymerase, P13L in the nucleocapsid protein, and T2016K in non-structural protein 3. These mutations are known to change protein function and thus may be associated with the rapidly expanding COVID-19 pandemic.
The scientists next performed a phylogenetic analysis using SARS-CoV-2 whole-genome sequence data available in the Global Initiative on Sharing All Influenza Data (GISAID) database. They observed that viruses circulating globally between December 2019 and May 2020 belonged to clades A1 to A4 and B (a clade is a monophyletic group: a common ancestor and all of its lineal descendants).
In India, circulating SARS-CoV-2 belongs to several evolutionary clades, including A2a, A3, A4, and B. The viruses analyzed in the current study, however, fell into three clades (A2a, A4, and B), with the majority clustering in clade A2a.
The current study reveals that the exponential rise in SARS-CoV-2-positive cases in central India is associated with multiple viral introduction events with diverse geographical linkages, as well as viral expansion within the region. The phylogenetic data link the viral introductions to Italy, the UK, France, and Southeast and Central Asia. The evolution of the virus in the studied region is evidenced by the cluster-wise segregation of SARS-CoV-2.
The identification of D614G as the prevalent mutation suggests that a highly infectious strain is circulating in central India. A growing body of evidence suggests that the D614G mutation increases the transmissibility of SARS-CoV-2 and is associated with higher viral load and mortality in COVID-19 patients.
*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.