The current pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is still spreading rapidly throughout the world. A new preprint on the bioRxiv* server describes the structure of the important viral non-structural proteins NSP12-16 and their role in SARS-CoV-2 infection.
The viral RTC
NSP12-16 are part of a set of non-structural proteins (NSPs) that are highly conserved across coronaviruses (CoVs) and have been demonstrated to be essential for viral RNA synthesis. Viral replication in CoVs is associated with the formation of a replication and transcription complex (RTC) from NSP1-16. The RTC synthesizes genomic RNA (gRNA) to replicate the viral genome and subgenomic RNAs (sgRNAs) to transcribe the viral protein-encoding genes.
Much remains to be understood about the structures comprising this complex, though a few local and single structures have been described.
Replication and transcription in CoVs occur via a "leader-to-body fusion" model. However, this model has yet to be correlated with the structure of the SARS-CoV-2 RTC, and the available knowledge does not explain how the two fit together.
An earlier study by the same team identified NSP15 cleavage sites and explained how this model works, using negative feedback principles to model the regulation of CoV replication and transcription. The current study aims to explain, on theoretical grounds, how NSP12-16 are arranged within the structure of the whole RTC, and to determine how this arrangement accounts for the RTC's function.
Leader-to-body fusion model
The leader-to-body fusion model begins with the gRNAs acting as templates for the synthesis of the antisense gRNA strand as well as antisense sgRNAs, via RNA-dependent RNA polymerase (RdRp) activity. Transcription regulatory sequences (TRS) are found on both the leader and the body.
As RdRp crosses a body TRS (TRS-B), it pauses, and the template may shift to the leader TRS (TRS-L). This shift leads to discontinuous transcription, forming sgRNAs. If the template does not shift, synthesis continues uninterrupted and gRNA is produced.
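The template switch described above can be pictured with a toy string model. The sketch below is only an illustration, not the preprint's method: it works in positive-sense coordinates for simplicity (the actual switch happens during antisense synthesis), treats the first TRS occurrence as TRS-L and every later one as a TRS-B, and uses the SARS-CoV-2 core TRS `ACGAAC` written in DNA letters. The function names and the toy genome are hypothetical.

```python
TRS_CORE = "ACGAAC"  # SARS-CoV-2 core TRS, positive sense, DNA alphabet


def find_trs_sites(genome, trs=TRS_CORE):
    """Return the start index of every occurrence of the TRS motif."""
    sites = []
    start = genome.find(trs)
    while start != -1:
        sites.append(start)
        start = genome.find(trs, start + 1)
    return sites


def subgenomic_rnas(genome, trs=TRS_CORE):
    """Fuse the leader to the sequence starting at each body TRS.

    Assumption: the first TRS is TRS-L and all later ones are TRS-Bs.
    Each product is leader + TRS + downstream body, mimicking the
    outcome of one discontinuous transcription event.
    """
    sites = find_trs_sites(genome, trs)
    if len(sites) < 2:
        return []  # no TRS-B found: only continuous gRNA synthesis
    leader = genome[:sites[0]]  # everything 5' of TRS-L
    return [leader + genome[body_start:] for body_start in sites[1:]]


# Hypothetical toy genome: leader, TRS-L, spacer, TRS-B, body gene
toy_genome = "ATTAAAGG" + TRS_CORE + "TTTTGGGG" + TRS_CORE + "ATGCCC"
sgrnas = subgenomic_rnas(toy_genome)
```

In this toy case the single sgRNA keeps the leader and the body gene but skips the spacer between the two TRS copies, which is the essence of the discontinuous step.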
The newly synthesized negative-sense gRNAs and sgRNAs are the templates for the synthesis of positive-sense gRNAs and sgRNAs. These are translated to produce NSP1-16, the four viral structural proteins, and the proteins encoded by the open reading frames ORF3a, 6, 7a, 7b, 8 and 10. The structural proteins comprise the spike (S), envelope (E), membrane (M) and nucleocapsid (N).
At a molecular level, the leader-to-body fusion model operates as follows: NSP15 cleaves the negative-sense RNAs at the TRS-B loci, after which the 3' ends of the TRS-B sequences hybridize to the junction regions of the TRS-L sequences. This allows template switching.
NSP15 can also cleave these RNAs at the TRS-L sequences, but this is not relevant to template switching.
The function of the RTC during this process is explained by this hypothesis. However, it is unclear how the other NSPs are associated during this synthesis.
TRS hairpins can be used to identify recombination regions in CoV genomes
The NSP15 cleavage sites have the TRS motif on the antisense strands, while RNA methylation sites have AAGAA-like motifs all over the viral genome. Using direct RNA sequencing, the researchers found that in eight genes, both these motifs occurred together at the TRS-Bs.
The genes involved were the four structural genes, along with ORF3a, 6, 7a and 8. Many hairpin sequences are also thought to be part of these TRS-Bs, encoded by complemented palindromic small RNA (cpsRNA) sequences 14-31 nucleotides long.
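The co-occurrence of the two motifs can be illustrated with a simple sequence scan. The sketch below is a simplification under stated assumptions, not the direct RNA sequencing analysis used in the study: it searches a positive-sense sequence written in DNA letters, uses the exact string `AAGAA` as a stand-in for the broader "AAGAA-like" family, and treats a 20-nucleotide window as "together". The window size, function names and test sequence are all hypothetical.

```python
import re

TRS_CORE = "ACGAAC"          # SARS-CoV-2 core TRS, positive sense
METHYLATION_MOTIF = "AAGAA"  # exact stand-in for "AAGAA-like" motifs


def motif_positions(seq, motif):
    """Start indices of all (possibly overlapping) motif occurrences."""
    pattern = f"(?={re.escape(motif)})"  # lookahead keeps overlaps
    return [m.start() for m in re.finditer(pattern, seq)]


def co_occurring_sites(seq, window=20):
    """Map each TRS start to the methylation motifs found near it.

    Only TRS sites with at least one motif within `window`
    nucleotides (an arbitrary, assumed threshold) are returned.
    """
    trs_sites = motif_positions(seq, TRS_CORE)
    meth_sites = motif_positions(seq, METHYLATION_MOTIF)
    result = {}
    for trs in trs_sites:
        nearby = [m for m in meth_sites if abs(m - trs) <= window]
        if nearby:
            result[trs] = nearby
    return result


# Hypothetical sequence: one TRS next to AAGAA, one isolated TRS
toy_seq = "GGGG" + "AAGAA" + "CC" + TRS_CORE + "G" * 30 + TRS_CORE + "GG"
hits = co_occurring_sites(toy_seq)
```

Only the first TRS copy is reported here, because the second has no `AAGAA` within the assumed window.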
In this study, the researchers found that NSP15 has an unusual breakpoint on the canonical TRS hairpin of ORF3a. This non-canonical breakpoint reflects structure-based cleavage rather than cleavage based on the occurrence of particular sequences as supposed earlier.
Such non-canonical TRS hairpins were also found in non-canonical TRS junctions. This led them to postulate that the cleavage of such hairpins was responsible for recombination events in coronaviruses. They confirmed that non-canonical TRS hairpins were present in seven recombination regions in the ORF1a, S and ORF8 genes in almost 300 betacoronavirus B subgroup genomes. This is in addition to the hairpins found in five typical recombination regions in an earlier study by the same researchers.
In short, TRS hairpins are markers of recombination regions in CoV genomes.
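As a rough illustration of how hairpin-forming sequences like those discussed in this section might be located computationally, the sketch below scans an RNA string for a stem whose two arms are exact reverse complements separated by a loop. It is a minimal sketch, not the researchers' pipeline: the stem length of 5 and the loop range of 4 to 21 nucleotides are assumptions chosen so that the total hairpin length falls in the 14-31 nucleotide range quoted for cpsRNAs, and real RNA folding also allows mismatches and G-U wobble pairs, which are ignored here.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_hairpins(rna, stem_len=5, min_loop=4, max_loop=21):
    """Return (start, end, loop_len) for each perfect-stem hairpin.

    `start` is the first base of the 5' arm and `end` is one past
    the last base of the 3' arm. All thresholds are assumed values.
    """
    hits = []
    last_start = len(rna) - 2 * stem_len - min_loop
    for start in range(last_start + 1):
        left_arm = rna[start:start + stem_len]
        target = reverse_complement(left_arm)
        for loop_len in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop_len
            right_arm = rna[right_start:right_start + stem_len]
            if len(right_arm) < stem_len:
                break  # ran off the end of the sequence
            if right_arm == target:
                hits.append((start, right_start + stem_len, loop_len))
    return hits


# Hypothetical sequence: stem GGCAC / GUGCC around a 6-nt loop
toy_rna = "AA" + "GGCAC" + "AAGAAC" + "GUGCC" + "AA"
hairpins = find_hairpins(toy_rna)
```

A scan of this kind only flags candidates by sequence complementarity. Whether a candidate actually folds, and whether its loop exposes a cleavage site, would need a proper secondary-structure prediction.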
Associations of RNA methylation, NSP15 cleavage and polyadenylation
The study also showed that the AAGAA-like motif, found in association with the 3' poly(A) lengths on both types of RNA, could stabilize newly forming RNAs, because these motifs are methylation sites. However, the exact link between this and the shortening of the poly(A) length on modified RNAs of SARS-CoV-2 is unclear.
It does seem, though, that the RTC may have a local structure in which NSP15 and NSP16 associate, and this may include a TRS hairpin. Such an arrangement could promote NSP15 cleavage and methylation of the RNA of the TRS hairpin on opposite sides.
The functioning of the RTC in leader-to-body fusion
The researchers came up with a novel explanation. Since there are three types of TRS-B hairpins in the betacoronavirus B subgroup, they postulated that RNA methylation in CoVs affects hairpin formation and thus affects the secondary RNA structure.
If the sequences flanking the AAGAA-like motif are methylated, the NSP15 cleavage sites will be in the loops of the second TRS-B hairpin class. This makes the second class the best fit for both motifs and for the NSP15 cleavage sites, which are exposed in a small loop where NSP15 can make contact with them.
This explanation also supports earlier mutation experiments showing that NSP15 cleavage sites are recognized not by sequence, being independent of the TRS motif, but by structure, depending on the adjacent sequences.
Information from many studies provides several pieces of data that may be connected. First, NSP15 cleavage sites have been identified. Second, TRS hairpins are present on eight genes in a highly conserved manner. Third, NSP15 cleavage is linked to RNA methylation and 3' polyadenylation. Fourth, ORF1b is more highly conserved than ORF1a, and unlike ORF1a it has no recombination regions. And finally, there is a very high ratio of sense reads to antisense reads.
The proposed sequence of events begins with RNA synthesis by NSP12, with NSP14 proofreading the process. The newly forming single-stranded RNA is then methylated, while TRS cleavage by NSP15 occurs only under certain conditions, which remain to be understood.
NSP7 and NSP8 act as cofactors of NSP12 in assembling the RTC. NSP8 probably interacts with the hexameric NSP15 as part of the RTC structure.
This theoretical construct of the part occupied by NSP12-16 in the whole RTC was meant to describe the functioning of the CoV RTC during the leader-to-body fusion process, showing the linkages between various RTC functions like RNA synthesis, NSP15 cleavage, RNA methylation and replication/transcription.
The structures of the various components should be uncovered by future research, as this will help understand the global RTC better, with important therapeutic implications.
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.