In a recent study posted to the bioRxiv* preprint server, researchers surveilled coronaviruses (CoV) found in bats via targeted genomic sequencing.
Public health emergencies like severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and coronavirus disease 2019 (COVID-19) have highlighted the importance of monitoring zoonotic CoVs. Furthermore, these emergencies have underlined the lack of knowledge regarding phylogenetic resolution and several viral genetic factors.
About the study
In the present study, researchers estimated the efficiency of hybridization probe capture in enriching material related to the CoV genome in oral and rectal samples collected from bats.
The team obtained oral and rectal samples between August 2015 and June 2018 from bats that were either on sale in markets or captured and subsequently released in the wild. These samples were collected from different locations in the Democratic Republic of the Congo (DRC). The bat species were ascertained by ecologists via a polymerase chain reaction (PCR) assay that targeted the cytochrome B gene.
A custom panel of hybridization probes was designed to target known bat CoV diversity. The coverage of the reference sequences by the custom panel was examined in silico. Probe coverage was also evaluated for a subset of the reference sequences representing full-length genomic sequences. The team used the custom panel to evaluate the recovery of CoV genomic material via probe capture in 25 libraries of metagenomic sequences. These libraries were prepared from a retrospective collection of almost 21 oral and rectal swabs collected from bats in DRC from 2015 to 2018.
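The in silico coverage check described above can be illustrated with a minimal sketch. It assumes the probe-to-reference alignments have already been computed and are supplied as half-open (start, end) intervals; the function name `covered_fraction` and the example coordinates are hypothetical, and the alignment tool and thresholds used in the study are not specified here.

```python
def covered_fraction(seq_length, probe_hits):
    """Return the fraction of reference positions covered by at least one probe.

    seq_length: length of the reference sequence in nucleotides.
    probe_hits: iterable of (start, end) half-open intervals, 0-based,
                giving where each probe aligns (assumed precomputed).
    """
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")

    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(probe_hits):
        # Clip to the reference and skip anything already counted.
        start = max(start, current_end)
        end = min(end, seq_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / seq_length


# Hypothetical example: two overlapping probes plus one isolated probe.
example = covered_fraction(1000, [(0, 120), (100, 220), (500, 620)])
```

Here the overlapping probes cover positions 0 to 219 and the third covers 500 to 619, so 340 of 1,000 positions are covered and `example` is 0.34.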
The custom probe panel was employed to capture the CoV genomic material from these metagenomic libraries of bat swabs. Genomic sequencing was subsequently performed. CoV recovery by the probes was assessed by assembling the captured sequencing reads de novo. The CoV sequences were later determined by aligning the contigs against CoV reference sequences. Assembly size metrics were also used to examine the degree to which the recovered contigs represented complete genomes.
The team also estimated the recovery of partial ribonucleic acid (RNA)-dependent RNA polymerase (RdRp) amplicons. Their probe coverage was assessed in silico to demonstrate that these targets were covered by the custom panel. The researchers also evaluated the concentration and integrity of nucleic acids, two major factors associated with the successful preparation of genomic libraries. These were measured as median RNA integrity number (RIN) values and RNA concentrations, which were subsequently compared against the extent of reference sequences recovered from the corresponding libraries.
The influence of blind spots on genomic recovery by the panel probes was examined in the genomic libraries. The team also evaluated the coverage of the probe with respect to reference sequences that were assigned to a particular phylogenetic group.
The study results showed that a total of 4,852 genomic sequences of bat CoVs were compiled to design a custom panel comprising 20,000 probe sequences. In this panel, 98.73% of the nucleotide positions in 90% of the target sequences were sufficiently covered, indicating that the custom panel provided wide probe coverage of known bat CoVs. The team recovered 113 CoV contigs from 17 of the 25 metagenomic sequencing libraries. The median total assembly size was 1,724 nucleotides, while the median assembly N50 was 533 nucleotides.
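For readers unfamiliar with the N50 metric, the following is a minimal sketch of the standard definition: the length of the contig at which the running sum of contig lengths, taken from longest to shortest, first reaches half of the total assembly size. The function name `n50` and the example contig lengths are hypothetical and are not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    if total <= 0:
        raise ValueError("assembly must contain at least one non-empty contig")

    running = 0
    for length in lengths:
        running += length
        # Compare doubled running sum to avoid fractional arithmetic.
        if running * 2 >= total:
            return length


# Hypothetical assembly of four contigs totalling 2,000 nucleotides.
example_n50 = n50([900, 600, 300, 200])
```

In this example the two longest contigs (900 and 600 nucleotides) together exceed half of the 2,000-nucleotide total, so the N50 is 600. A low N50 relative to genome size, as reported here, indicates a fragmented assembly.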
Notably, four out of 25 libraries yielded no CoV sequences, even though partial RdRp sequences had been generated from these specimens. Furthermore, probe capture did not yield any whole CoV genomes, and many of the specimens showed scattered and discontinuous coverage of the reference sequences. The custom probe panel covered 95.3% of the nucleotide positions in the partial RdRp amplicons. Nevertheless, for 12 of the 25 sequenced libraries, no portion of the partial RdRp sequence was recovered, while in seven of the 25 libraries the partial RdRp sequences were fully or almost fully recovered (more than 95%). This indicated that genomic recovery via the probes was limited by factors other than the inclusivity of the probe panel.
The team observed weak monotonic associations of lower RIN and concentration values with worse genomic recovery. The correlation was statistically significant for RNA concentration but not for RNA integrity. These weak relationships indicated that additional factors were responsible for hindered genomic recovery, including low concentrations of viral material or a lack of probe coverage in genomic regions outside the partial RdRp target.
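A monotonic association of this kind is commonly quantified with a rank-based statistic. The sketch below assumes Spearman's rank correlation as the measure; the article does not state which test the researchers used, and the function names and input values here are hypothetical illustrations rather than study data.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: RNA concentration vs. percent of reference recovered.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

A rho near 1 or -1 indicates a strong monotonic relationship, while values closer to 0, as implied by the weak associations reported here, indicate that rank in one variable predicts rank in the other only loosely.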
A total of 92.3% of the reference sequences, such as CDAB0203R-PRE, CDAB0217R-PRE, and CDAB0492R-PRE, were recovered for the 25 libraries; however, complete viral spike genes were absent. This suggested the presence of CoVs similar to bat CoV CMR704-P12 and Chaerephon bat CoV/Kenya/KY22/2006, except with new spike genes that were different from the reference sequence spike genes.
Overall, the study findings showed the potential of probe capture of CoV genomic material in accurately assessing a wider range of viruses.
*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.