In a recent study posted to the bioRxiv* preprint server, researchers investigated whether severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) originated naturally from an animal-to-human spillover or synthetically during coronavirus (CoV) research experiments.
Researchers often use the in vitro genome assembly (IVGA) method to synthetically construct CoV variants in laboratories. The technique uses restriction enzymes to generate deoxyribonucleic acid (DNA) building blocks that can then be assembled in the correct order to create the viral genome.
For the synthetic generation of viruses, the viral genome is engineered to add or remove the stitching regions known as restriction sites (RS). Such RS modifications could serve as fingerprints of IVGA. Understanding the origin of SARS-CoV-2 is critical for preventing future pandemics.
About the study
In the present study, researchers investigated whether the origin of SARS-CoV-2 was natural or synthetic using a common IVGA method for ribonucleic acid (RNA) virus infectious clones.
The full-length RNA genome was reconstructed in DNA by IVGA to create infectious CoV clones. Two distinct endonuclease enzymes were used to assemble the viral genome, with two sites of one enzyme flanking the region of interest so that this region could be manipulated efficiently without reassembling the whole viral backbone for every variant.
The team computed the random RS distributions that would be expected in unmodified viruses by digesting a broad range of natural CoV genomes with a comprehensive list of restriction endonuclease enzymes. The MERS-CoV (Middle East respiratory syndrome CoV) reverse genetic system was engineered for IVGA by removing the existing BglI sites, inserting six additional BglI sites, and evenly spacing type IIS RS.
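The digestion step described above can be illustrated with a minimal Python sketch. It assumes the standard recognition sequences for BsaI (GGTCTC) and BsmBI (CGTCTC) and, as a simplification, treats the start of each recognition sequence as the cut position, whereas real type IIS enzymes cut a few bases downstream. The function names are hypothetical and are not taken from the study's code.

```python
import re

# Recognition sequences (5'->3'). Type IIS sites are not palindromic,
# so both strands have to be searched.
SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def site_positions(genome, enzymes=SITES):
    """Sorted 0-based start positions of all recognition sites on either strand."""
    genome = genome.upper()
    positions = set()
    for motif in enzymes.values():
        for pattern in (motif, revcomp(motif)):
            # A lookahead keeps overlapping matches from being skipped.
            for match in re.finditer(f"(?={pattern})", genome):
                positions.add(match.start())
    return sorted(positions)


def fragment_lengths(genome, enzymes=SITES):
    """Fragment lengths of a linear genome cut at every site start (simplified)."""
    edges = [0] + site_positions(genome, enzymes) + [len(genome)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]


# Toy sequence with one BsaI site and one BsmBI site on the opposite strand.
toy = "A" * 10 + "GGTCTC" + "A" * 10 + "GAGACG" + "A" * 10
print(site_positions(toy), fragment_lengths(toy))
```

On a real genome, the list of fragment lengths is the raw material for the RS map comparisons described in the rest of the article.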
Additions and removals were performed through synonymous mutations, creating seven fragments, of which the longest was 5,721 base pairs (bp) long, covering 19% of the MERS genome. The analysis focused on SARS-CoV-2 BsaI/BsmBI sites compared to RS maps of all CoVs. Mutation analysis was performed by generating 100,000 random in silico mutants for RaTG13 and BANAL-20-52. Phylogenetic inference analysis was performed with pairwise whole-genome alignments between RaTG13 and SARS-CoV-2 and between BANAL-20-52 and SARS-CoV-2, digested by BsaI/BsmBI.
GitHub repository data of engineered CoVs were used for the analysis, and CoV spike (S) gene open reading frames (ORFs) were obtained from the NCBI Gene database. Google Scholar was searched to create a representative list of examples of infectious CoV clones published between 2000 and 2019.
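To make the idea of a synonymous RS change concrete, the following Python sketch checks that a single-base edit removes a BsaI recognition sequence while leaving the encoded protein unchanged. It uses the standard genetic code; the 12-base ORF is a made-up toy example, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an in-frame DNA sequence, ignoring any trailing partial codon."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def is_synonymous(before, after):
    """True if both sequences have the same length and encode the same protein."""
    return len(before) == len(after) and translate(before) == translate(after)


# Toy ORF (hypothetical): ATG GGT CTC AAA encodes M-G-L-K and contains GGTCTC.
before = "ATGGGTCTCAAA"
after = "ATGGGCCTCAAA"  # GGT -> GGC, both glycine

print(translate(before), translate(after))      # same protein
print("GGTCTC" in before, "GGTCTC" in after)    # site present, then removed
print(is_synonymous(before, after))
```

The same check run in the other direction shows how a site can be introduced silently, which is why such edits leave no trace at the protein level.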
The authors reported that the synthetic fingerprint of SARS-CoV-2 was anomalous among wild-type (wt) CoVs and common among lab-assembled viruses. The SARS-CoV-2 RS map was consistent with those reported previously for synthetic CoV genomes and satisfied all criteria essential for an efficient reverse genetic system. It differed from its closest relatives by a significantly higher rate of synonymous mutations in the synthetic-appearing recognition sites, and the fingerprint was unlikely to have evolved from its close relatives.
In the RS map analysis, the IVGA fingerprint of SARS-CoV-2 showed the following features: (i) recognition sites for unique endonuclease enzymes (BsmBI, BglI, and BsaI) were incorporated and/or eliminated; (ii) digestion by the selected enzymes yielded five to eight fragments; (iii) the largest fragment was <8 kb; (iv) all the sticky ends were unique; (v) all the recognition sites were created via synonymous mutations; (vi) two unique recognition sites could flank sites for further manipulation; and (vii) recognition sites could be aligned with those of other viruses to enable segment substitutions.
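Criteria (ii) to (iv) are simple enough to check mechanically. The Python sketch below is an illustration only: the thresholds come from the list above, the function names are hypothetical, and the fragment lengths and overhangs are invented toy values rather than data from any real genome. The uniqueness check is also simplified, since it compares overhang strings directly and ignores reverse-complement matches.

```python
def meets_fragment_criteria(fragment_lengths, min_fragments=5,
                            max_fragments=8, max_longest_bp=8000):
    """Check criteria (ii) and (iii): 5 to 8 fragments, longest below 8 kb."""
    if not fragment_lengths:
        return False
    n = len(fragment_lengths)
    return min_fragments <= n <= max_fragments and max(fragment_lengths) < max_longest_bp


def overhangs_unique(overhangs):
    """Check criterion (iv) in simplified form: no two sticky ends are identical."""
    return len(set(overhangs)) == len(overhangs)


# Hypothetical toy values, for illustration only.
evenly_spaced = [5000, 6000, 4000, 7000, 3000, 4900]
one_long_fragment = [12000, 3000, 5000, 4000, 6000]

print(meets_fragment_criteria(evenly_spaced))      # six fragments, longest 7 kb
print(meets_fragment_criteria(one_long_fragment))  # longest fragment is 12 kb
print(overhangs_unique(["ACGT", "TTAG", "GGCA"]))
```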
The combination of silent (synonymous) mutations and the restriction site fingerprint was almost universal among synthetic viruses, which the authors interpreted as a strong indication that SARS-CoV-2 originated synthetically. The SARS-CoV-2 RS map contained five BsaI/BsmBI RS. In contrast to closely related viruses, these sites were evenly spaced, and the map lacked two highly conserved BsaI RS found in nearly all other B lineage sarbecoviruses. The map was an outlier, falling in the bottom 1.0% of longest-fragment lengths among non-engineered CoVs.
The analysis of 1,491 RS maps in the type IIS wt distribution showed that SARS-CoV-2 lay more standard deviations (SD) below the mean than any non-engineered virus, indicating a <0.1% probability of such an anomalous RS map arising in a wt, non-engineered virus. SARS-CoV-2 yielded six fragments when doubly digested by the type IIS restriction endonuclease enzymes, whereas its close relatives RaTG13 and BANAL-20-52 yielded seven and five fragments, respectively, with distinct RS.
Twelve silent mutations were observed in nine distinct BsaI/BsmBI RS between SARS-CoV-2 and RaTG13, and five silent mutations were observed in seven distinct BsaI/BsmBI RS between SARS-CoV-2 and BANAL-20-52. Only 1% of RaTG13 mutants and 0.1% of BANAL-20-52 mutants showed BsaI/BsmBI RS maps with greater z-scores than SARS-CoV-2.
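The kind of outlier statistic described here can be sketched in a few lines of Python. This is a generic z-score and empirical tail fraction, not necessarily the exact statistic the authors computed; the function names and the reference values are hypothetical.

```python
from statistics import mean, stdev


def z_score(value, reference):
    """Number of sample standard deviations that value lies from the reference mean."""
    return (value - mean(reference)) / stdev(reference)


def fraction_at_or_below(value, reference):
    """Empirical fraction of reference values that are at or below value."""
    return sum(x <= value for x in reference) / len(reference)


# Hypothetical longest-fragment lengths (kb) for a small reference set.
reference_kb = [10, 12, 14, 16, 18]
observed_kb = 8

print(round(z_score(observed_kb, reference_kb), 2))
print(fraction_at_or_below(observed_kb, reference_kb))
```

A strongly negative z-score combined with a small tail fraction is what would mark a map as unusually short in its longest fragment relative to the reference set.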
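The mutant comparison rests on a Monte Carlo idea: generate many random mutants of a parent genome and ask how often they look as extreme as the observed one. The Python sketch below illustrates that idea under loud assumptions. It uses uniformly random point substitutions and a plain count of BsaI/BsmBI sites as a stand-in statistic, whereas the study compared z-scores of RS maps and may have constrained the mutations differently. All names and the toy parent sequence are hypothetical.

```python
import random

# BsaI and BsmBI recognition sequences on both strands.
MOTIFS = ("GGTCTC", "GAGACC", "CGTCTC", "GAGACG")


def count_sites(seq):
    """Count recognition sites on either strand, including overlapping ones."""
    return sum(seq.startswith(m, i) for i in range(len(seq)) for m in MOTIFS)


def random_mutant(seq, n_mutations, rng):
    """Return a copy of seq with n_mutations distinct positions substituted."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)


def fraction_with_at_most(seq, observed_sites, n_mutations,
                          n_mutants=1000, seed=0):
    """Fraction of random mutants with at most observed_sites recognition sites."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = sum(
        count_sites(random_mutant(seq, n_mutations, rng)) <= observed_sites
        for _ in range(n_mutants)
    )
    return hits / n_mutants


# Toy parent sequence (hypothetical) carrying two sites.
parent = "ACGT" * 10 + "GGTCTC" + "ACGT" * 10 + "CGTCTC" + "ACGT" * 10
print(count_sites(parent))
print(fraction_with_at_most(parent, observed_sites=1, n_mutations=10))
```

A small fraction means that few random mutants reach the observed value, which is the logic behind the percentages reported above.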
Overall, the study findings showed a high probability of SARS-CoV-2 originating synthetically as an infectious clone assembled in vitro. The findings could aid policy-making and research to prevent future pandemics and encourage biosafety improvements worldwide.
*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.