The current COVID-19 pandemic is characterized by unpredictable clinical phenotypes. Most severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections are asymptomatic or mild, but in a subgroup of patients the infection kills over a quarter of those affected. A recent study by researchers from Semmelweis University and Pazmany Peter Catholic University in Budapest, Hungary, published on the preprint server medRxiv* in October 2020, examines how specific mutations relate to the disease's outcome.
The virus's genome is large, about 30 kb in length, with 25 genes. Phylogenetic analysis reveals three variants, A, B, and C, distributed differently across Asia, Europe, and the Americas. The proteins encoded in this genome include the envelope protein, an RNA-dependent RNA polymerase (RdRp), a spike glycoprotein, and the membrane glycoprotein.
Mutations and Functional Effects
Mutations in any of the structural and functional genes may impact the virus's characteristics, including its virulence. Mutations in the untranslated genomic regions may also have significant effects. As new SARS-CoV-2 mutations continue to emerge, their functional impact is being studied.
For instance, some mutations lead to variation in the RdRp enzyme, while others increase transmissibility. The latter type of mutation may give a strain a survival advantage, allowing it to become dominant after its introduction into an area, as seen with the spike D614G mutant, which has largely replaced the original strain in most areas.
Mutations and Patient Outcomes
The current study focused on identifying viral mutations that were associated with different patient outcomes. For instance, if a mutation reduced the virus's virulence, the resulting mild infection might allow the virus to spread widely. Conversely, mutations that result in death may prompt intensified containment efforts, leading to the fadeout of the outbreak.
To achieve this, the researchers linked the mutations to all the outcomes across a large patient cohort. Of the more than 72,000 complete sequences available, clinical data accompanied just over 5,000, and follow-up data were available for ~3,200 patients. Because only a limited proportion of sequences could be included, the study may be affected by sampling bias, say the researchers.
The majority of samples came from Asia, while ~27% were from Europe, ~9% were from Central America, ~6-7% from the Americas, and ~5% from Africa. The severity breakdown was as follows: 625 mild, ~2,300 moderate, and ~220 severe.
There were about 2,100 mutations in all, of which 463 were not represented in the clinical samples. The researchers estimated that each sample had, on average, 2.8 mutations. The average sample size for the wild-type virus with a mild outcome was 623, while it was 2,336 and 217 for the wild-type virus with hospitalized and severe outcomes, respectively.
Mutations Related to Outcome
The researchers found 141 mutations that correlated significantly with the clinical outcome.
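The article does not say which statistical test the researchers used. As a rough illustration of how one mutation could be checked for association with outcome, the sketch below applies a two-sided Fisher's exact test to a 2x2 table of counts. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table

                      mutation present   mutation absent
        mild                 a                 b
        severe               c                 d
    """
    row1, row2 = a + b, c + d      # totals per outcome group
    col1 = a + c                   # total samples carrying the mutation
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that x of the mutated samples are mild
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts (assumption, for illustration only):
# the mutation appears in 2 of 20 mild and 15 of 20 severe samples.
p_value = fisher_exact_two_sided(2, 18, 15, 5)
print(f"p = {p_value:.2g}")
```

With thousands of mutations tested in this way, a real analysis would also need a correction for multiple comparisons before calling any single result significant.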
Mutations related to mild disease
Looking only at mutations observed in 2% or more of the samples, they found 64 samples correlated with 6 mutations in the ORF8, ORF3a, nsp4, nsp6, and the L and N proteins.
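The 2% prevalence cut-off described above can be sketched as a simple counting step. This is a minimal sketch, assuming each sample is represented as a set of mutation labels; the helper name `frequent_mutations`, the label format, and the toy data are illustrative assumptions and not the researchers' pipeline.

```python
from collections import Counter


def frequent_mutations(samples, min_fraction=0.02):
    """Return {mutation: fraction of samples} for mutations seen in at
    least `min_fraction` of the samples."""
    if not samples:
        return {}
    total = len(samples)
    # Count each mutation once per sample, even if it is listed twice
    counts = Counter(m for sample in samples for m in set(sample))
    return {m: c / total for m, c in counts.items() if c / total >= min_fraction}


# Toy data (assumption): four samples, two with no mutations at all.
samples = [{"S:D614G", "S:L54F"}, {"S:D614G"}, set(), set()]
common = frequent_mutations(samples, min_fraction=0.5)
print(common)
```

Only the mutations that pass this filter would then go on to be compared against clinical outcome.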
Mutations associated with severe disease
In samples from patients with moderate to severe disease, they found nine mutations in seven genes, including D614G and L54F in the spike protein, one in the RdRp, and others in other structural and non-structural proteins.
They also explored all mutations that were present in 10 or more severely ill patients. This revealed two more mutations, in the spike protein and the nsp7 gene, present in 28 and 11 severely ill patients, respectively.
Overall, mutations correlated with a mild outcome were less prevalent than those correlated with severe outcomes, at ~1,500 vs. 6,700 mutations. There were over 5,000 mutations that were not linked to any clinical outcome.
The N Protein and Mutational Significance
The researchers found that of the 17 mutations thought to be significant, the greatest number (5) were in the nucleocapsid phosphoprotein, which was associated with both types of outcomes: some mutations in the N protein were linked to a mild outcome and others to a severe one. One of them worsened the outlook sharply, cutting the chance of a mild outcome from 76% to less than 1%.
Most of these mutations were clustered close together, in a small region of the phosphoprotein between positions 194 and 204. This region is serine-rich and is phosphorylated.
Phosphorylation activates host RNA helicase DDX1, thus enabling the production of longer fragments of subgenomic mRNA.
The discovery of the role of phosphorylation in the N protein could be useful in designing drugs that target these sites. It also shows that this approach could help identify more important mutations in the genome of the virus.
The structural proteins were found to be more prone to mutations than the non-structural proteins. The destabilization associated with some non-structural protein mutations might have led to the divergence of SARS-CoV-2 from the SARS-CoV lineage.
Another important finding is that mutations in the SARS-CoV-2 genome can shift the virus towards either greater or lesser virulence in the future, which justifies keeping a watch over the rate, location, and effect of mutations.
*medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.