Researchers in Sweden and China have presented a new methodology for predicting mutation “hotspot” sites within severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) - the agent responsible for the current coronavirus disease 2019 (COVID-19) pandemic.
Within these hotspots, a mutation would cause a small local change within the SARS-CoV-2 spike protein structure – the surface structure the virus uses to bind to and infect host cells.
This small change in the spike protein’s structure would result in a large conformational change in the protein that would alter its biological function.
As a proof-of-concept exercise, the team first showed how the notorious spike variant D614G could have been predicted using the new methodology. This variant has enhanced the infectivity of the virus and is currently the most prevalent strain globally.
The team also presents other examples of potential hotspot residues that would be strong candidates for mutation.
The researchers say the methodology could be used to design effective drugs and antibodies against the spike protein and, more generally, to search for and identify mutation hotspots.
A pre-print version of the paper is available on the server bioRxiv*, while the article undergoes peer review.
(Color online) Spike protein subunit S1 consists of the N-terminal domain NTD and the receptor-binding domain RBD. The subunit S2 consists of the fusion peptide FP, two heptapeptide repeat sequences HR1 and HR2, the transmembrane domain TM and the cytoplasm domain tail CT.
The spike protein is a major research focus
Given its role in binding to and infecting host cells, the transmembrane spike glycoprotein that coats the surface of SARS-CoV-2 is of particular interest to researchers.
The protein is made up of two subunits. Subunit 1 (S1) starts with an N-terminal domain (NTD; residues 14 to 305) that contributes to conformational changes in the protein during interaction with a host cell.
The NTD is followed by the receptor-binding domain (RBD; residues 319 to 541), which initiates host cell entry by binding to the host cell receptor angiotensin-converting enzyme 2 (ACE2).
The RBD is followed by a junction region that lies between S1 and subunit 2 (S2). Cleavage sites within this segment are hydrolyzed by host cell proteases that prime the spike protein for fusion with the cell membrane.
Subunit 2 (S2) makes up the remainder of the protein, starting with a fusion peptide spanning residues 788 to 806 that initiates fusion. This is followed by two heptapeptide repeat (HR) sequences, HR1 at residues 912-984 and HR2 at residues 1163-121. These HR sequences form a six-helical bundle that enables the virus to fuse with and enter the cell.
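As a rough illustration of the layout described above, the residue ranges quoted in this article can be put into a small lookup table. This is only a sketch: the function name `domain_of` is hypothetical, only the ranges stated in the text are included, and HR2 is left out because its end residue appears truncated here.

```python
# Residue ranges as quoted in the article (inclusive).
# HR2 is omitted because its end residue is not fully given in the text.
SPIKE_DOMAINS = {
    "NTD": (14, 305),    # N-terminal domain, subunit S1
    "RBD": (319, 541),   # receptor-binding domain, subunit S1
    "FP": (788, 806),    # fusion peptide, subunit S2
    "HR1": (912, 984),   # heptapeptide repeat 1, subunit S2
}


def domain_of(residue):
    """Return the name of the listed domain containing `residue`, or None."""
    for name, (start, end) in SPIKE_DOMAINS.items():
        if start <= residue <= end:
            return name
    return None  # residue lies outside every range listed above


if __name__ == "__main__":
    for site in (103, 330, 614, 800, 950):
        print(site, domain_of(site))
```

Under these ranges, site 103 falls in the NTD, while site 614 falls in none of the listed domains, since it sits between the end of the RBD and the start of the fusion peptide.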
Many efforts to develop vaccines and therapies have focused on the RBD and the HR1/HR2 domain.
“But for a durable antiviral one needs to know, how to identify those amino acids that are prone for a mutation and how to predict the biological consequences of any potential mutation,” say Antti Juhani Niemi from Stockholm University and Xubiao Peng from the Beijing Institute of Technology.
“Since conformation is pivotal for a protein’s function, the knowledge of hotspots is important for a better understanding of how the spike protein operates and how it can be incapacitated.”
What did the researchers do?
Now, the team has presented a methodology that can help predict sites along the Cα backbone of the spike, where a change in the local topology can result in a large conformational change that alters the protein’s activity.
Such a change in the topology of the backbone can only occur when a structural bifurcation takes place at a critical site. A prerequisite for this bifurcation is the presence of a flattening point.
Therefore, the methodology involves identifying potential mutation hotspots by charting all sites that are proximal to a flattening point.
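The article does not spell out how a flattening point is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a flattening point can be approximated as a place where four consecutive Cα atoms are nearly coplanar, meaning the virtual torsion angle is close to 0 or π. The function names, the tolerance and the toy coordinates are all hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def virtual_torsion(p0, p1, p2, p3):
    """Dihedral angle (radians) defined by four consecutive C-alpha positions."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.atan2(y, x)


def near_planar_sites(ca_coords, tol=0.1):
    """Indices i where atoms i..i+3 are nearly coplanar.

    Assumption: |sin(torsion)| < tol stands in for "flattening point";
    the paper's actual criterion may differ.
    """
    hits = []
    for i in range(len(ca_coords) - 3):
        tau = virtual_torsion(*ca_coords[i:i + 4])
        if abs(math.sin(tau)) < tol:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy coordinates, not real spike data: the first four points lie in
    # one plane, and the fifth point leaves that plane.
    toy = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0), (2, 1, 1)]
    print(near_planar_sites(toy))
```

In a real analysis the coordinates would come from a Protein Data Bank structure, and residues within a few positions of each flagged index would be the ones charted as candidate hotspots.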
(Color online) a) The present all-atom structure of the spike protein in the neighborhood of site 103. b) The substitution of A in place of 103G, as predicted by Chimera. There are no apparent steric hindrances to the substitution.
The potential outcome of a mutation was predicted by comparing the residues with those present in best-matching high-resolution Protein Data Bank structures.
As a proof-of-concept, the team applied the methodology to the notorious D614G mutation site. This site was shown to be proximal to a flattening point, and the D→G substitution was correctly predicted.
Several topologically similar hotspot sites were identified in the NTD and the fusion core that forms the junction between HR1 and HR2.
Hotspot sites within the NTD were found to be good candidates for mutation, whereas those in the fusion core were more stable.
The researchers say that, in particular, residue 1080A in the fusion core was identified as stable against mutation but appeared to be prone to a change in local topology due to its proximity to a bi-flattening point.
“This can make the residue 1080A into a good target for the development of structure-based fusion inhibitors,” suggests the team.
What are the implications of the study?
The researchers say identifying and analyzing these mutation hotspot sites in the spike protein can help predict its future evolution and help develop structure-based therapeutic drugs and vaccines.
“The methodology can be used to design effective drugs and antibodies against the spike protein. It can also be employed more generally, whenever one needs to search for and identify mutation hotspots in a protein,” they write.
“Since conformation is pivotal for a protein’s function, topology should be a most effective tool also in protein research,” concludes the team.
*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.