In a recent study posted to the bioRxiv* pre-print server, a team of researchers investigated the existence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-host chimeric RNA and described two novel human-derived genomic insertions present in the circulating variants of SARS-CoV-2.
There is evidence that insertions in the SARS-CoV-2 genome have the potential to give rise to new variants with enhanced infectivity, pathogenicity, and antibody escape. The source of these insertions is unknown, although some recent studies have suggested, without certainty, that human RNAs could be a source of some of them. Researchers have speculated that human-derived insertions in the SARS-CoV-2 genome are generated through RNA-dependent RNA polymerase (RdRp)-driven template switching events, but this too has not been supported with evidence.
About the study
In the present study, researchers analyzed publicly available direct RNA sequencing data from SARS-CoV-2 infected cells to demonstrate that the RdRp-driven template switching between SARS-CoV-2 and host RNA is infrequent and stochastic.
By analyzing the publicly available Global Initiative on Sharing All Influenza Data (GISAID) SARS-CoV-2 genome collection, the researchers also identified two genomic insertions in circulating SARS-CoV-2 variants that most likely originated from human host 18S and 28S ribosomal RNAs (rRNAs).
The publicly available Nanopore direct RNA-sequencing data were quality filtered and mapped to the host and SARS-CoV-2 transcriptomes to identify potential chimeric sequences. From a total of 30 samples that were analyzed, 16 had host-viral chimeric reads with an average of 0.027% (standard deviation 0.045%) of the reads mapped to SARS-CoV-2 being chimeric.
One sample had 0.207% chimeric reads, while the other 15 samples each had less than 0.06%, suggesting that chimeric reads were typically rare. Because the data came from infected cell lines, even these rates could overestimate how often chimeric reads arise in vivo.
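The per-sample rate described above is a simple ratio: chimeric reads divided by reads mapped to SARS-CoV-2. As a minimal sketch, assuming each read has already been reduced to the set of references its segments align to, the calculation might look like the following. The read names, the `alignments` dictionary, and the helpers `chimeric_percent` and `summarize` are hypothetical illustrations, not taken from the preprint, and the toy numbers are made up.

```python
from statistics import mean, stdev

def chimeric_percent(alignments):
    """Percent of virus-mapped reads that also carry a host-aligned segment.

    `alignments` maps a read ID to the set of reference labels ("virus",
    "host") that segments of that read aligned to. This is an assumed toy
    representation, not the preprint's actual data format.
    """
    viral = [refs for refs in alignments.values() if "virus" in refs]
    if not viral:
        return 0.0
    chimeric = sum(1 for refs in viral if "host" in refs)
    return 100.0 * chimeric / len(viral)

def summarize(percents):
    """Mean and sample standard deviation of per-sample percentages."""
    return mean(percents), stdev(percents)

# Made-up toy sample: four virus-mapped reads, one of them chimeric.
toy_sample = {
    "read1": {"virus"},
    "read2": {"virus", "host"},
    "read3": {"host"},  # host-only read, ignored in the denominator
    "read4": {"virus"},
    "read5": {"virus"},
}

print(chimeric_percent(toy_sample))
```

Whether the reported standard deviation is a sample or population estimate is not stated in the article; `stdev` here is the sample version, and `statistics.pstdev` would be the population alternative.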
Further, the researchers investigated how the viral and host RNA sequences were joined in the chimeric reads. They observed that the chimeric sequences were not an equal mix of positive- and negative-sense sequences, and they contained 92.24% host-derived positive-sense sequences. Highly expressed host genes and structural RNA genes likely have a higher chance of being observed in chimeric RNA reads. These findings suggested that chimera formation was stochastic, with no preference for starting with host or viral sequences, and that highly expressed genes formed chimeric sequences at a higher frequency.
In the chimeric reads, the viral-derived sequences were annotated as positive-sense RNA, as were most of the host-derived sequences. Further analysis showed that the few host-derived reads annotated as negative-sense RNA were long non-coding RNAs found in the raw reads. It is thus likely that these negative-sense sequences were mis-annotated rather than derived from negative-sense RNA, which further validated the structure of the host-viral chimeric sequences. Based on these findings, the researchers concluded that these host-viral chimeric sequences were created from positive-to-positive-strand template switching events.
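The strand bookkeeping behind this kind of conclusion can be sketched in a few lines. The example below assumes each chimeric read has already been split into segments labeled by origin and strand; the tuple layout, the helper names, and the toy reads are all hypothetical and are not the preprint's data or code.

```python
from collections import Counter

def strand_summary(chimeric_reads):
    """Count (origin, strand) pairs over all segments of chimeric reads.

    Each read is a list of (origin, strand) tuples in 5'-to-3' order,
    where origin is "host" or "virus" and strand is "+" or "-".
    """
    return Counter(seg for read in chimeric_reads for seg in read)

def percent_positive(counts, origin):
    """Percent of segments from `origin` annotated as positive-sense."""
    pos = counts[(origin, "+")]
    total = pos + counts[(origin, "-")]
    return 100.0 * pos / total if total else 0.0

def first_origin_counts(chimeric_reads):
    """How often chimeric reads start with host versus viral sequence."""
    return Counter(read[0][0] for read in chimeric_reads if read)

# Made-up toy reads for illustration only.
toy_reads = [
    [("host", "+"), ("virus", "+")],
    [("virus", "+"), ("host", "+")],
    [("host", "-"), ("virus", "+")],
    [("virus", "+"), ("host", "+")],
]

counts = strand_summary(toy_reads)
print(percent_positive(counts, "host"), first_origin_counts(toy_reads))
```

A host-derived positive-sense share far above 50%, combined with roughly balanced starting origins, is the pattern the article describes as consistent with stochastic positive-to-positive-strand template switching.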
The study results showed that the formation of host-viral chimeric mRNAs could be transient events with no long-term impact on viral fitness. In addition, although not widespread, host-derived 18S and 28S rRNA insertions were identified in some circulating variants of SARS-CoV-2, indicating that human genetic material could be a source of genomic insertions in SARS-CoV-2. Notably, rRNAs have been a source of insertions in influenza genomes, in some cases resulting in significantly more pathogenic viral variants.
The study demonstrated that viral-host chimeric sequences formed through stochastic RdRp template switching events. The mechanisms at work in these events were unclear and warranted further work, considering the potential importance of these processes in viral evolution and the emergence of new variants.
It is known that SARS-CoV-2 forms double-membrane vesicles utilizing host rRNAs during its replication, as these molecules are abundant in the host cells. The authors speculated that a similar phenomenon could be involved in human rRNA-derived insertions in the SARS-CoV-2 genome, but the formation of double-membrane vesicles by SARS-CoV-2 complicated this process. In the absence of direct evidence, this phenomenon warrants further investigation.
Overall, the study results supported the hypothesis that some SARS-CoV-2 insertions are derived from human genetic material, highlighting the potential importance of host-derived insertions in viral evolution.
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.