Researchers in the United States have shown that genes from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) – the causative agent of coronavirus disease 2019 (COVID-19) – can be integrated into the genome of infected human cells.
The team says the viral RNA can be expressed as chimeric transcripts with fused cellular and viral sequences.
“Importantly, such chimeric transcripts are detected in patient-derived tissues,” writes the team from the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts and the National Cancer Institute in Frederick, Maryland.
Rudolf Jaenisch and colleagues say the findings may help to explain why some patients who have recovered from SARS-CoV-2 infection still test positive for the virus months later.
Patients remaining positive for viral RNA is an unresolved issue
Continuous or recurrent SARS-CoV-2-positive tests by polymerase chain reaction (PCR) have been reported in patients weeks or months after they have recovered from COVID-19. However, no infectious virus was isolated or shed from these patients, and the cause of the continued viral RNA production remains unknown.
Like other betacoronaviruses, SARS-CoV-2 uses an RNA-dependent RNA polymerase to replicate its genomic RNA and to transcribe subgenomic RNAs.
One potential explanation for the recurrent detection of viral RNA in the absence of viral replication is that DNA copies of viral subgenomic RNAs may become integrated into the DNA of the host cell via reverse transcription.
“Transcription of the integrated DNA copies could be responsible for positive PCR tests long after the initial infection was cleared,” write Jaenisch and colleagues.
Indeed, nonretroviral RNA virus sequences have been detected in the genomes of many animals. Several of these integrations exhibit signals that are consistent with the integration of DNA copies of viral mRNAs into the germline via long interspersed nuclear element (LINE) retrotransposons.
Furthermore, nonretroviral RNA viruses such as lymphocytic choriomeningitis virus (LCMV) can be reverse transcribed into DNA copies by an endogenous reverse transcriptase. Studies have also shown that DNA copies of the viral sequences can integrate into the DNA of host cells.
“Moreover, expression of endogenous LINE1 and other retrotransposons in host cells is commonly up-regulated upon viral infection, including SARS-CoV-2 infection,” say Jaenisch and the team.
What did the researchers do?
The team used three different approaches to investigate whether SARS-CoV-2 RNA can be reverse transcribed and integrated into the genome of infected human cells in culture. The three approaches, each used to detect DNA copies of viral sequences in the host genome, were nanopore long-read sequencing, Illumina paired-end whole-genome sequencing, and Tn5 tagmentation-based DNA integration site enrichment sequencing.
As reported in the Proceedings of the National Academy of Sciences, all three approaches provided evidence that SARS-CoV-2 RNA can be integrated into the genome of the host cell.
DNA copies of SARS-CoV-2 sequences were present in the genome and were shown to be integrated via a LINE1-mediated retroposition mechanism.
SARS-CoV-2 RNA can be reverse transcribed and integrated into the host cell genome.
(A) Experimental workflow.
(B) Chimeric sequence from a Nanopore sequencing read showing integration of a full-length SARS-CoV-2 NC subgenomic RNA sequence (magenta) and human genomic sequences (blue) flanking both sides of the integrated viral sequence. Features indicative of LINE1-mediated “target-primed reverse transcription” include the target site duplication (yellow highlight) and the LINE1 endonuclease recognition sequence (underlined). Sequences that could be mapped to both genomes are shown in purple with mismatches to the human genomic sequences in italics. The arrows indicate sequence orientation with regard to the human and SARS-CoV-2 genomes as shown in C and D.
(C) Alignment of the Nanopore read in B with the human genome (chromosome X) showing the integration site. The human sequences at the junction region show the target site, which was duplicated when the SARS-CoV-2 cDNA was integrated (yellow highlight) and the LINE1 endonuclease recognition sequence (underlined).
(D) Alignment of the Nanopore read in B with the SARS-CoV-2 genome showing the integrated viral DNA is a copy of the full-length NC subgenomic RNA. The light blue highlighted regions are enlarged to show TRS-L (I) and TRS-B (II) sequences (underlined, these are the sequences where the viral polymerase jumps to generate the subgenomic RNA) and the end of the viral sequence at the poly(A) tail (III). These viral sequence features (I–III) show that a DNA copy of the full-length NC subgenomic RNA was retro-integrated.
(E) A human–viral chimeric read pair from Illumina paired-end whole-genome sequencing. The read pair is shown with alignment to the human (blue) and SARS-CoV-2 (magenta) genomes. The arrows indicate the read orientations relative to the human and SARS-CoV-2 genomes. The highlighted (light blue) region of the human read mapping is enlarged to show the LINE1 recognition sequence (underlined).
(F) Distributions of human–CoV2 chimeric junctions from Nanopore (Left) and Illumina (Right) sequencing with regard to features of the human genome.
In some tissue samples taken from patients, the team also found evidence suggesting that a large proportion of the viral sequences were transcribed from integrated DNA copies of viral sequences, generating viral–host chimeric transcripts.
“These and other data are consistent with a target primed reverse transcription and retroposition integration mechanism and suggest that endogenous LINE1 reverse transcriptase can be involved in the reverse transcription and integration of SARS-CoV-2 sequences in the genomes of infected cells,” writes the team.
However, approximately 30% of the viral integrants lacked a recognizable nearby LINE1 endonuclease recognition site, suggesting that integration could also occur via another mechanism.
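One hallmark of LINE1-mediated integration described in the study is a target site duplication, in which the same short stretch of host sequence appears on both sides of the inserted viral DNA. The Python sketch below illustrates how such a duplication could be looked for once the human sequences flanking an insert have been extracted from a read. It is a minimal illustration rather than the researchers' own pipeline: the function name, the length thresholds and the example sequences are all hypothetical.

```python
def find_target_site_duplication(left_flank, right_flank, min_len=5, max_len=30):
    """Return the longest sequence that both ends the left flank and
    starts the right flank, or "" if no match of at least min_len exists.

    left_flank  -- host sequence immediately upstream of the insert
    right_flank -- host sequence immediately downstream of the insert
    min_len and max_len are illustrative thresholds, not published values.
    """
    left = left_flank.upper()
    right = right_flank.upper()
    longest = min(max_len, len(left), len(right))
    # Try the longest candidate first so the full duplication is reported.
    for k in range(longest, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return left[-k:]
    return ""


# Made-up flanks sharing a 10-base duplicated target site.
upstream = "CGGATC" + "AAGTTACCTG"
downstream = "AAGTTACCTG" + "TTCAGG"
tsd = find_target_site_duplication(upstream, downstream)
```

A real analysis would also need to tolerate sequencing errors and to check for the LINE1 endonuclease recognition sequence near the junction, which this exact-match sketch does not attempt.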
What are the implications of the findings?
Jaenisch and colleagues say the findings raise several questions that require further investigation.
For example, the researchers ask whether integrated SARS-CoV-2 sequences express viral antigens in patients and whether these might influence the clinical course of disease.
“If a cell with an integrated and expressed SARS-CoV-2 sequence survives and presents a viral- or neoantigen after the infection is cleared, this might engender continuous stimulation of immunity without producing infectious virus and could trigger a protective response or conditions such as autoimmunity as has been observed in some patients,” they write.
More generally, the integration of viral DNA in somatic cells may represent a consequence of natural infection that could play a role in the effects of other common disease-causing RNA viruses such as dengue and influenza virus, says the team.
The results may also be relevant for clinical trials of antiviral therapies.
“If integration and expression of viral RNA are fairly common, reliance on extremely sensitive PCR tests to determine the effect of treatments on viral replication and viral load may not always reflect the ability of the treatment to fully suppress viral replication because the PCR assays may detect viral transcripts that derive from viral DNA sequences that have been stably integrated into the genome rather than infectious virus,” say Jaenisch and colleagues.