The coronavirus disease 2019 (COVID-19) pandemic is caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). The pandemic has demonstrated the value of real-time analysis of sequencing data linked to a wide range of databases.
Study: Real-time monitoring and analysis of SARS-CoV-2 nanopore sequencing with minoTour. Image Credit: Design_Cells/ Shutterstock
Scientists have proposed that real-time sequencing could be a crucial component of pathogen surveillance during disease outbreaks. During this pandemic, new tools have been developed, and existing approaches have been improved, for analyzing sequence data at local and international levels. Portable sequencers (e.g., those developed by Oxford Nanopore) are easy to use in the field, so samples can be sequenced anywhere. These tools can determine the lineage of a sequence and rapidly classify it as a Variant of Concern (VoC) or a Variant under Investigation (VuI).
Development and Use of MinoTour to Analyze SARS-CoV-2 Sequences
A study published on the bioRxiv* preprint server focused on Oxford Nanopore Technologies (ONT) sequencers, which can rapidly provide data on the SARS-CoV-2 virus sequence. The authors of this study exploited the fact that various viral sequencing libraries are available that can quickly produce appropriate data for the majority of samples. Researchers believe that this process can be accelerated by incrementally analyzing reads as they are created.
A preprint version of the study is available on the bioRxiv* server while the article undergoes peer review.
ONT offers a range of sequencers, namely MinION, GridION, and PromethION, that transform sequencing from a fixed process into a real-time one. This real-time capability is the feature of ONT sequencers that the authors of this study examined. Scientists believe that data analysis can begin earlier if the sequence data are available immediately after the DNA finishes translocating through the pore. This would significantly reduce the total time required to answer a specific question regarding the lineage of the virus or its classification. Members of the COVID-19 Genomics UK (COG-UK) consortium have generated thousands of SARS-CoV-2 consensus sequences using ONT sequencers.
In this study, researchers used minoTour, a real-time analysis and monitoring system, to assess the performance of each sequencing run. For individual viral genomes, ONT sequencers can provide more data than required. Hence, researchers believe it is appropriate to stop sequencing once enough data have been obtained for the analysis. Shorter sequencing runs could preserve flow cell health, allowing flow cells to be reused for other experiments and sequencing libraries. Implemented properly, this could lower the cost per sample sequenced. Hence, researchers integrated these capabilities into minoTour.
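The idea of stopping a run once enough data have been collected can be illustrated with a minimal Python sketch. This is not the algorithm used in minoTour; the function names, the 20x depth threshold, and the 90% amplicon fraction are assumptions chosen purely for illustration.

```python
from typing import Dict

def amplicon_completeness(depth_by_amplicon: Dict[str, int], min_depth: int = 20) -> float:
    """Return the fraction of amplicons whose read depth is at least min_depth.

    min_depth=20 is an illustrative default, not a value taken from the study.
    """
    if not depth_by_amplicon:
        return 0.0
    covered = sum(1 for depth in depth_by_amplicon.values() if depth >= min_depth)
    return covered / len(depth_by_amplicon)

def can_stop_run(samples: Dict[str, Dict[str, int]],
                 min_depth: int = 20,
                 min_fraction: float = 0.9) -> bool:
    """Return True when every sample has enough well-covered amplicons.

    samples maps a (hypothetical) barcode name to its per-amplicon depths.
    """
    if not samples:
        return False
    return all(
        amplicon_completeness(depths, min_depth) >= min_fraction
        for depths in samples.values()
    )

# Hypothetical depths for two barcoded samples, two amplicons each.
run = {
    "barcode01": {"amplicon_1": 25, "amplicon_2": 30},
    "barcode02": {"amplicon_1": 5, "amplicon_2": 40},
}
print(can_stop_run(run))  # barcode02 still has one under-covered amplicon
```

In practice, some samples may never reach the threshold, which is one reason a human decision, rather than a fixed rule like this one, may be preferable.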
The ARTIC Network delivers complete protocols for the best practice informatic analysis of SARS-CoV-2. Typically, the ARTIC pipeline uses nanopolish for signal level analysis of nanopore data during variant calling. However, it also provides medaka as an alternative to nanopolish. Medaka is a machine learning pipeline which only needs FASTQ data.
Since signal-level data are not available within minoTour, researchers incorporated the ARTIC medaka workflow, which enables consensus genomes to be produced in real time as sequence data are generated. This approach is unique: to date, no other web-based analysis platform has adopted the real-time features of the nanopore platform and linked them with the sequence data for further study.
The main advantage of minoTour is that it can alert the user through the Twitter API that a sequencing run can be terminated because sufficient data have been obtained. Users can examine detailed breakdowns and visualizations of the ongoing sequencing run in minoTour, which helps them make an informed decision about terminating the run. This step could also be automated; however, researchers have not implemented this in their study, as specific samples could be important to users.
In this study, researchers analyzed the impact of stopping sequencing runs early. They did so by comparing the consensus sequences for 454 SARS-CoV-2 samples assembled by both the medaka and nanopolish pipelines. These sequences were used to determine the Phylogenetic Assignment of Named Global Outbreak (PANGO) lineage, variant classification, and SNP calls for each sample at appropriate time points during the sequencing run. Researchers observed that PANGO lineages and VoC/VuI assignments were consistent across all 454 samples studied using nanopolish and medaka.
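One way to picture this kind of time-point comparison is to ask when a sample's lineage call stops changing. The following Python sketch is a hypothetical illustration only: the function name, the data layout, and the example calls are assumptions and are not taken from the study.

```python
from typing import List, Optional, Tuple

def first_stable_timepoint(calls: List[Tuple[int, Optional[str]]]) -> Optional[int]:
    """Return the earliest time from which the lineage call equals the final call.

    calls is a list of (minutes_into_run, lineage) pairs sorted by time;
    lineage is None when no call could be made yet.
    """
    if not calls:
        return None
    final_lineage = calls[-1][1]
    stable_from = calls[-1][0]
    # Walk backwards until the call differs from the final one.
    for minutes, lineage in reversed(calls):
        if lineage != final_lineage:
            break
        stable_from = minutes
    return stable_from

# Made-up lineage calls for one sample at successive time points.
example_calls = [
    (10, None),
    (20, "B.1"),
    (30, "B.1.1.7"),
    (60, "B.1.1.7"),
    (120, "B.1.1.7"),
]
print(first_stable_timepoint(example_calls))
```

With these made-up calls, the lineage matches the final assignment from the 30-minute time point onward, so data collected after that point would not have changed the result for this sample.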
Scientists stated that minoTour could be installed on a single laptop or on a computer running sequencers. It can also be run at a central hub, with data uploaded from multiple devices concurrently. The minIT/MK1C device cannot be configured to run minoTour, but users can upload data using minFQ. Scientists have engineered the ARTIC pipeline in minoTour so that any ARTIC-compatible primer scheme can be integrated and used for analysis.
This study showed that minoTour could be extended to include real-time analysis using the best available practices for SARS-CoV-2 sequencing. By developing an algorithm, researchers were able to predict the performance of a sequencing run and better define when to terminate sequencing. This approach not only saves time but also reduces the cost per sample sequenced.
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.
- Munro, R. et al. (2021) "Real-time monitoring and analysis of SARS-CoV-2 nanopore sequencing with minoTour." bioRxiv. doi: 10.1101/2021.09.13.459777.