Protein glycosylation is the co-translational or post-translational covalent attachment of glycans to the amino acid side chains of proteins. The proteins of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), are extensively glycosylated. In a review published in Signal Transduction and Targeted Therapy, researchers from Sichuan University survey the available literature on glycosylation in the virus and its receptor.
Study: The glycosylation in SARS-CoV-2 and its receptor ACE2.
Mass spectrometry (MS)-based N-glycoproteomics is the most common approach for site- and structure-specific characterization of glycosylation. In the first stage of this process, characterization is performed at the level of the glycan, the intact glycopeptide, the glycosite-containing peptide, or the intact glycoprotein. Glycans normally need to be enriched using porous graphitized carbon (PGC) or other hydrophilic materials, while intact glycopeptides can be analyzed with or without enrichment. Glycopeptides with low stoichiometry benefit the most from enrichment. Hydrophilic interaction liquid chromatography, lectin affinity chromatography, and graphitized carbon chromatography are normally used to enrich glycopeptides.
Before tandem MS analysis, chromatographic separation can also help simplify the composition of the analyte. PGC columns can be used to separate hydrophilic, underivatized native glycans, while permethylated glycans are more often separated using reversed-phase C18 chromatography. Glycosite-containing peptides and intact glycopeptides can be separated using several of the aforementioned techniques. Following separation, tandem MS (MS/MS) with various dissociation methods can be used to analyze the glycans and glycoproteins.
The two largest challenges in MS/MS analysis of both N- and O-glycosylation are glycosite localization and glycan structure identification. N-glycosites can be localized using fragment ions in MS2 spectra, and structural isomers can be distinguished using fragment ions of the N-glycan moieties.
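To make the glycan side of this concrete, the short Python sketch below computes the monoisotopic mass of a free glycan from its monosaccharide composition, which is the kind of value an MS search compares against an observed precursor. It is only an illustration under stated assumptions: the function names, the composition dictionary format, and the Man5 example are chosen here and do not come from the review, and the sketch ignores derivatization such as permethylation.

```python
# Monoisotopic residue masses (Da) of common monosaccharides, i.e. the mass
# each unit contributes once it is linked in a chain (free sugar minus water).
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.07937,  # N-acetylhexosamine, e.g. GlcNAc
    "Fuc": 146.05791,     # fucose (a deoxyhexose)
    "NeuAc": 291.09542,   # N-acetylneuraminic acid (a sialic acid)
}
WATER = 18.01056   # added once for the free reducing end
PROTON = 1.00728   # mass of a proton, used for positive-mode m/z


def glycan_mass(composition):
    """Monoisotopic mass of a released, underivatized glycan.

    `composition` maps residue names to counts, e.g. {"HexNAc": 2, "Hex": 5}.
    This format is an assumption made for the example.
    """
    return sum(RESIDUE_MASS[name] * count
               for name, count in composition.items()) + WATER


def protonated_mz(mass, charge):
    """m/z of the [M + zH]z+ ion for a neutral mass and charge state z."""
    return (mass + charge * PROTON) / charge


# Illustrative input: the high-mannose composition HexNAc2Hex5 (Man5).
man5 = {"HexNAc": 2, "Hex": 5}
print(round(glycan_mass(man5), 4))
print(round(protonated_mz(glycan_mass(man5), 1), 4))
```

A composition-level mass like this cannot tell structural isomers apart, since isomers share the same residues; that is why the fragment ions described above are needed for structure identification.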
The SARS-CoV-2 genome encodes four structural proteins, 16 non-structural proteins, and nine accessory factors. The majority of the encoded proteins are glycoproteins, but only four of these have known glycosites. The spike protein is a trimeric transmembrane protein composed of the S1 and S2 subunits, and it must be cleaved by a host protease to be activated. The S1 subunit contains a receptor-binding domain (RBD) that binds to angiotensin-converting enzyme 2 (ACE2) to permit viral cell entry, while the S2 subunit is responsible for membrane fusion.
Glycans shield about 40% of the surface of the spike protein, acting as camouflage against the immune system. The spike protein is more exposed in SARS-CoV-2 than in its nearest relatives, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). Twenty-three N-linked glycosites with high occupancy have been identified, while only two O-linked glycosites are heavily occupied.
The E protein is a membrane protein that can form pentameric structures with cation-selective channel activity and Ca2+ conductivity in the ER-Golgi intermediate compartment. It is key for pathogenicity, viral assembly, and release. Two putative N-linked glycosites may exist in the transmembrane segment, but this is based on sequence prediction alone.
The M protein is the most abundant envelope protein of SARS-CoV-2. It has three N-terminal transmembrane domains and interacts with the other three structural proteins. It is essential for the assembly of new virus particles. Little is known about its glycosylation, but computational predictions suggest eight N-glycosites, whose functions are unknown.
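Sequence-based predictions of N-glycosites, such as those mentioned for the E and M proteins, typically start from the N-X-S/T sequon: an asparagine followed by any residue except proline, then a serine or threonine. The Python sketch below scans a sequence for that motif. It is a minimal illustration rather than the method used in the review or in any particular prediction tool; the function name is hypothetical and the input is a made-up toy sequence, not a SARS-CoV-2 protein.

```python
import re

# N-X-S/T sequon with X != Pro. The lookahead keeps the match to the Asn
# alone, so overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Toy sequence (invented for this example):
#   N2  in "NAS" -> sequon
#   N5  in "NPT" -> excluded, proline at the X position
#   N9  in "NNT" -> sequon
#   N10 in "NTS" -> sequon
toy = "MNASNPTQNNTSK"
print(find_sequons(toy))
```

A sequon is necessary but not sufficient for N-glycosylation, so positions found this way remain putative until they are confirmed experimentally, for example by the MS approaches described earlier.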
ORF3a is an accessory protein localized at the surface, with broad functions including enhancing viral entry, regulating the production of pro-inflammatory cytokines and chemokines, and participating in ion channel formation. It is also involved in the release of viral particles from the cell. There are likely four O-linked glycosites, two of which show significantly higher occupancy. No N-linked glycosites have been identified, and the function of the O-linked glycosites is unknown.
Human ACE2 (hACE2), the receptor that the S1 subunit binds to enter the cell, is expressed in a wide variety of tissues and organs. Seven N-linked and two O-linked glycosites have been identified. The N-linked glycans are complex, with above 75% occupancy and sialic acid linkages; sialic acid is the attachment factor for several coronaviruses. The function of the O-linked glycosites is uncertain.
Glycosylation can significantly alter protein function, and a proper understanding of glycosylation in SARS-CoV-2 could help identify drug targets and guide further research. Many variants of concern show altered glycosylation, and as case numbers begin to rise again, this review helps illuminate the current state of knowledge concerning this important feature.