In this interview, News-Medical talks to Professor John Rossen of the University of Groningen about his research into DNA sequencing and microbiology and how his work has benefitted from Tecan.
Could you start by giving our readers a brief overview of your background and the history of DNA sequencing?
I am appointed as a Professor at the University of Groningen and the University of Utah, School of Medicine. I am also currently employed at IDbyDNA, an organization that is designing and selling software to analyze metagenomics data.
Sequencing essentially started once DNA had been discovered. The first bacterial genome was sequenced in 1995, and the field developed very rapidly from then on, with many new technologies being discovered and developed. Nowadays, we are able to sequence dozens of bacteria in parallel in order to generate whole-genome sequences.
Sequencing possibilities within clinical microbiology really began with whole-genome sequencing. Whole-genome sequencing makes use of a bacterial culture, from which we can extract DNA, create DNA libraries, start sequencing, type pathogens, and interpret results.
What are some of the benefits and applications of next-generation sequencing in particular?
Next-generation sequencing (NGS) can be used to detect outbreaks, for example. In one case study in 2014, a hospital saw an increase in vancomycin-resistant Enterococcus faecium across several different wards. We sequenced all the isolates obtained from the patients and saw that data was clustered.
If we had used conventional multilocus sequence typing (MLST), which compares only seven loci, we would have had less discriminatory power, and these isolates would all have fallen into the same cluster.
However, we used core genome MLST (cgMLST) to examine far more loci, increasing discriminatory power and revealing different clusters. This is essentially what NGS can provide, and this is just one example of how it has been advantageous in clinical microbiology.
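As an illustrative aside, the difference in discriminatory power can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the analysis used in the study: the isolate names, locus names, allele numbers, and distance thresholds are all hypothetical, and real cgMLST schemes compare hundreds to thousands of loci.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b, loci):
    """Count the loci at which two isolates carry different allele numbers."""
    return sum(profile_a[locus] != profile_b[locus] for locus in loci)


def cluster(profiles, loci, max_distance):
    """Single-linkage clustering: isolates within max_distance are joined."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links until reaching the cluster representative.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b], loci) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical schemes: 7 MLST loci, plus 20 extra loci standing in for cgMLST.
mlst_loci = [f"mlst{i}" for i in range(1, 8)]
extra_loci = [f"cg{i}" for i in range(1, 21)]
cgmlst_loci = mlst_loci + extra_loci


def make_profile(extra_allele):
    """All isolates share MLST alleles; they differ only at the extra loci."""
    profile = {locus: 1 for locus in mlst_loci}
    profile.update({locus: extra_allele for locus in extra_loci})
    return profile


profiles = {
    "iso_A": make_profile(1),
    "iso_B": make_profile(1),
    "iso_C": make_profile(2),
    "iso_D": make_profile(2),
}

mlst_clusters = cluster(profiles, mlst_loci, max_distance=0)
cgmlst_clusters = cluster(profiles, cgmlst_loci, max_distance=3)
print(mlst_clusters)    # one cluster containing all four isolates
print(cgmlst_clusters)  # two clusters: A with B, and C with D
```

With only the seven shared loci, every isolate looks identical; adding more loci exposes the differences that split the group into separate clusters.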
NGS also offers a number of other potential applications, for example, the amplicon-based sequencing workflow.
When working with a patient sample, it is possible to include a positive and a negative control, similar to the process when using real-time PCR. It is also possible to spike the patient sample with an internal control and then extract RNA and DNA to perform PCR. The nucleic acids are extracted directly from clinical material, meaning there is no culture involved.
NGS methods can be used to amplify certain targets, for example, 16S or AMR genes. After amplification, we can start sequencing the amplicons, detect selected pathogens and interpret the results to generate a laboratory report.
An early example of an amplicon-based next generation sequencing approach was actually developed by a company called BioInnovation Solutions, previously Pathogenica.
The company designed a panel that had several hospital-acquired infectious bacteria present as well as several AMR genes. The technology used 298 probes, so it equaled a total of 144 PCRs.
We were able to generate results in 24 to 36 hours using this system, publishing a number of our findings. Unfortunately, this company no longer exists, and despite the methodology being very promising, it was not adopted in the field at that time.
How did this work link into the development and application of metagenomics in clinical microbiology?
Based on our findings, we were able to use clinical shotgun metagenomics to introduce NGS into the clinical microbiology lab, not only in terms of typing bacteria but also for applications involving real diagnosis.
Metagenomics is culture-independent and unbiased, a key advantage over traditional whole-genome sequencing.
We can use metagenomics to detect bacteria and other microbes. We can use it to examine host response, and for pathotyping and epi-typing. We can also use it for the identification of pathogens, something that has historically been challenging when using bacterial whole-genome sequencing.
The cost of these technologies is slightly higher compared to whole-genome sequencing. The turnaround time is varied, depending on the tools being used.
What should a lab consider when looking to introduce metagenomics into its clinical microbiology applications?
When introducing metagenomics into a clinical microbiology lab, a key consideration is to ensure that comprehensive databases are available. These should be extensively curated and validated. Virtual panels offer the option to use syndrome-based filters to better interpret results; for example, filtering results based on urinary tract infections or respiratory tract infections.
If the goal is to use metagenomics for pathotyping, databases should also include bacterial and fungal resistance markers, as well as antiviral resistance markers.
Quantification is an important consideration, and the ability to automate reporting is helping in developing streamlined reporting processes that are easily interpretable by clinicians.
What does a typical metagenomic workflow look like?
In a typical clinical microbiology metagenomic workflow, we would start with a clinical sample. The positive and negative controls are defined, and we spike the internal control into the clinical sample before extracting RNA and DNA.
We generally sequence everything that is present in the sample so we can detect any existing pathogens, but this will also include the normal flora.
At this point, we have information on the organisms detected, so we need to prioritize the results based on previously established thresholds before generating the diagnostic laboratory report.
In order to do this, it is useful to have information on the pathogens. My current company has mined over 13 million publications in PubMed, using co-occurrence of the organism’s name and the disease terms.
This has allowed us to link different pathogens to different infectious diseases, ultimately making it much easier to prioritize the pathogens found in clinical samples.
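To make the idea of co-occurrence mining concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the company's pipeline: the example abstracts, the organism list, and the disease terms are invented, and a real system would work from PubMed records with far more careful term matching.

```python
import re
from collections import Counter


def mentions(term, text):
    """True if the term appears as whole words in the text (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def cooccurrence_counts(abstracts, organisms, disease_terms):
    """Count abstracts in which an organism name and a disease term both occur."""
    counts = Counter()
    for text in abstracts:
        found_organisms = [o for o in organisms if mentions(o, text)]
        found_diseases = [d for d in disease_terms if mentions(d, text)]
        for organism in found_organisms:
            for disease in found_diseases:
                counts[(organism, disease)] += 1
    return counts


# Hypothetical abstracts, invented purely for illustration.
abstracts = [
    "Escherichia coli is a frequent cause of urinary tract infection.",
    "Urinary tract infection due to Escherichia coli in outpatients.",
    "Campylobacter jejuni was isolated from a patient with gastroenteritis.",
]
organisms = ["Escherichia coli", "Campylobacter jejuni"]
disease_terms = ["urinary tract infection", "gastroenteritis"]

counts = cooccurrence_counts(abstracts, organisms, disease_terms)
for (organism, disease), n in sorted(counts.items()):
    print(f"{organism} + {disease}: {n}")
```

Pairs that co-occur often across the literature become candidates for an organism-to-disease link; pairs that never co-occur simply have a count of zero.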
In practice, laboratory staff interested in a specific infection can simply apply a filter that takes into account the data mining done before.
For example, users can specifically focus on pathogens that are involved in gastrointestinal infections, and the report will only feature those pathogens that have been identified as being the cause of gastrointestinal infection in the literature.
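A syndrome-based filter of this kind, combined with a reporting threshold, could look roughly like the Python sketch below. Everything specific here is an assumption made for illustration: the `Detection` type, the organism-to-syndrome map, the read counts, and the `min_reads` threshold are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    organism: str
    reads: int  # number of sequencing reads assigned to this organism


def prioritize(detections, syndrome, organism_syndromes, min_reads):
    """Keep detections above the threshold that are linked to the syndrome,
    ordered from most to fewest reads."""
    relevant = [
        d for d in detections
        if d.reads >= min_reads
        and syndrome in organism_syndromes.get(d.organism, set())
    ]
    return sorted(relevant, key=lambda d: d.reads, reverse=True)


# Hypothetical literature-derived links between organisms and syndromes.
organism_syndromes = {
    "Campylobacter jejuni": {"gastrointestinal"},
    "Salmonella enterica": {"gastrointestinal"},
    "Streptococcus pneumoniae": {"respiratory"},
}

# Hypothetical sequencing output, including an organism with no linked syndrome.
detections = [
    Detection("Staphylococcus epidermidis", 5000),
    Detection("Campylobacter jejuni", 1200),
    Detection("Salmonella enterica", 40),
    Detection("Streptococcus pneumoniae", 900),
]

report = prioritize(detections, "gastrointestinal", organism_syndromes, min_reads=100)
print([d.organism for d in report])
```

In this toy run only Campylobacter jejuni reaches the report: it is linked to the chosen syndrome and exceeds the threshold, whereas the other detections are either unlinked or too weak.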
Could you give our readers an example of this process used in clinical practice?
In one case study, clinical metagenomics was used to diagnose urinary tract infections. A total of 115 culture-positive and culture-negative routine urine samples were used. As an internal control, those samples were spiked with two phages, T4 and T7.
DNA was extracted, and we attempted to deplete the host DNA to make the method more sensitive. Shotgun DNA sequencing was performed, and results were filtered down to 97 known uropathogens.
This included 8 of the most common uropathogens, as well as 77 additional bacteria and fungi. We employed artificial intelligence and machine learning to continuously optimize this profile, reporting the results in a semi-quantitative format and predicting antibiotic resistance.
We focused on results from cultures that have a bacterial load of more than 10^5 colony-forming units per ml. This is an important threshold in clinical microbiology because it is often the cut-off at which a clinician will decide that there is a specific infection, in this instance, a urinary tract infection.
Comparing the culture results with what we find using clinical metagenomics has revealed that, depending on the patient population in question, there is almost 95% agreement with the culture results in cases where there are more than 10^5 colony-forming units per ml.
This was found to be less when there was a lower load of bacteria present in the urine samples, decreasing to 47% of the cases where the culture was positive, but shotgun metagenomics was not.
How deep should the sequencing be in order to yield useful results?
This very much depends on the case in question. To continue the example above, here we began with over five million reads. We down-sampled to around two million reads and trimmed them to 200 base pairs (bp). This provided good results for several bacteria, but we did start to lose some of the positivity rates.
In this case, Klebsiella pneumoniae was present in 9 out of 10 samples, but when we used only the 200 bp reads, only 7 out of 10 were positive. This decreased further when we down-sampled to shorter reads, trimming to 100 bp.
This example indicates that for shotgun metagenomics, it is generally necessary to sequence quite deep. About 8 million reads per specimen is a good average, and it is possible to make this more sensitive or sequence it deeper, but this does incur additional cost.
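The down-sampling and trimming experiment described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name, the randomly generated toy reads, and the scaled-down read counts are hypothetical, and a real analysis would operate on FASTQ files with dedicated tools.

```python
import random


def downsample_and_trim(reads, n_reads, max_length, seed=0):
    """Randomly keep n_reads reads, then cut each read to at most max_length bases."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    if n_reads < len(reads):
        chosen = rng.sample(reads, n_reads)
    else:
        chosen = list(reads)
    return [read[:max_length] for read in chosen]


# Toy data: 1,000 random 250 bp reads stand in for millions of real reads.
generator = random.Random(1)
toy_reads = [
    "".join(generator.choice("ACGT") for _ in range(250))
    for _ in range(1000)
]

subset_200 = downsample_and_trim(toy_reads, n_reads=400, max_length=200)
subset_100 = downsample_and_trim(toy_reads, n_reads=400, max_length=100)
print(len(subset_200), len(subset_200[0]), len(subset_100[0]))
```

Re-running pathogen detection on each reduced dataset shows how sensitivity changes as fewer and shorter reads are available, which is the comparison behind the 9 out of 10 versus 7 out of 10 result.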
What are some examples of metagenomics being used to investigate respiratory infections?
In one highly illustrative example, an elderly man came to a California emergency room with trauma-related injuries. He had a subdural hematoma and a radial fracture, but he also had hypoxia. When a CT chest scan was performed, the hospital staff saw a number of lung opacities with ground-glass appearances.
He was intubated for hypoxic respiratory failure and admitted to the intensive care unit. Tracheal aspirate was sent into the clinical lab, and qPCR was able to detect SARS-CoV-2 in this patient in real-time. When we used the sample for clinical metagenomics, SARS-CoV-2 was also found.
Because we covered the full genome of this virus, we were able to characterize it in more detail. We were able to perform phylogenetic analysis to determine that the virus was closely related to the clade G viruses and other strains that were known to circulate within the United States.
A second case study involved a 12-year-old male with a history of obesity and Type 2 diabetes. He presented to his primary care provider with right-sided pleurisy. A chest x-ray showed a lower lobe infiltrate, so the patient was admitted overnight and given some antibiotics.
Initial diagnostics were done via a PCR with a respiratory panel, showing that this patient was infected with HCoV-OC43, although this did not explain his pneumonia.
The patient improved and was discharged the following day, but he came back three days later, returning to the emergency room with worsening symptoms. When he was admitted to the hospital, new diagnostics were performed - pleural fluid was taken, and all aerobic and anaerobic cultures were found to be negative.
We used shotgun metagenomics, and this detected Prevotella pleuritidis, a rare pathogen that has only recently been recognized as being able to cause pneumonia. With this new insight, the patient was placed on anaerobic coverage with meropenem, and he improved markedly. This shows that clinical metagenomics is unbiased and can detect co-infections.
In one final case study, a 16-year-old female with a history of leukemia (who had had chemotherapy) presented with septic shock. Clostridium difficile colitis was also seen. She was seen at a California hospital for respiratory distress and admitted to the PICU, where lung abnormalities were also noted.
A range of diagnostics was performed, including galactomannan assay, bacterial/fungal cultures, and BAL with a respiratory viral panel. Next generation sequencing was also done on plasma cell-free DNA.
All the initial diagnostic tests were found to be negative, but investigation with metagenomics showed Mycoplasma hominis. This could be treated with clindamycin, so treatment was started, and the patient improved tremendously.
These examples really illustrate how powerful these kinds of technologies are.
What advantages do capture probes offer over amplicon sequencing?
Capture probes are more tolerant of mismatches in target sequences than PCR primers and probes. They are suitable for use with high-diversity targets like RNA viruses. If designed in a tiled way, they can also cover the whole genome.
This kind of enrichment-based sequencing starts with the standard process: the patient sample, positive and negative controls, a spiked-in internal control, extraction of RNA and DNA, and the preparation of libraries.
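To illustrate what a tiled design means, the short Python sketch below places overlapping probes along a genome so that every position is covered. The genome length, probe length, and step size are hypothetical values chosen for the example, and the functions are not taken from any probe-design tool.

```python
def tile_probes(genome_length, probe_length, step):
    """Return (start, end) intervals for probes tiled along a genome.

    Probes start every `step` bases; a final probe is added flush with the
    genome end so that the last bases are not left uncovered.
    """
    if probe_length > genome_length:
        raise ValueError("probe is longer than the genome")
    if step < 1:
        raise ValueError("step must be at least 1")
    last_start = genome_length - probe_length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(start, start + probe_length) for start in starts]


def is_fully_covered(intervals, genome_length):
    """Check that the intervals leave no gap between position 0 and the end."""
    covered_to = 0
    for start, end in sorted(intervals):
        if start > covered_to:
            return False  # gap before this probe begins
        covered_to = max(covered_to, end)
    return covered_to >= genome_length


# Hypothetical design: 120-base probes every 60 bases on a 1,000-base genome.
probes = tile_probes(genome_length=1000, probe_length=120, step=60)
print(len(probes), probes[0], probes[-1])
print(is_fully_covered(probes, 1000))
```

Choosing a step smaller than the probe length makes neighbouring probes overlap, so each position is targeted by more than one probe; a step larger than the probe length would leave gaps, which `is_fully_covered` would report.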
These pre-enriched libraries are enriched using hybrid capture probes, and then sequencing begins.
An interesting aspect of this method is that if the enriched libraries are sequenced and found to be negative, and we want to see whether something is present that was not targeted by the enrichment panel, it is possible to return to the pre-enriched libraries and sequence those using shotgun metagenomics.
Once the enriched libraries are sequenced, we can detect all enriched pathogens, prioritize results and generate a laboratory report.
The number of viral reads will typically increase considerably after enrichment, resulting in higher coverage of the sequencing and facilitating coverage of the whole genome.
Having the whole genome available allows us to look at mutations in the genome, something that may be especially important when investigating the pre- and post-vaccine eras.
Having access to the whole genome also allows us to look at mismatches or mutations that may make the virus more virulent or that may allow a virus to escape from a potential vaccine in the future.
How have you used your findings so far to enhance the detection of respiratory pathogens?
We have been developing a broader range of tools for the detection of respiratory pathogens.
These include AMR genes and the oligo probe-based enrichment panel, in which RNA and DNA targets can be processed in a single library preparation, effectively enabling simultaneous detection and differentiation of over 80 bacteria, 42 viruses, 53 fungi, and more than 1200 AMR markers.
Using these tools, it is possible to acquire the full genome for SARS-CoV-2 and for influenza viruses.
Enrichment vastly improves the rates of detection. To provide an example, in one case study, 17 out of 29 samples that were known to be positive (by standard of care testing) were also confirmed as positive when investigated using clinical shotgun metagenomics.
After enrichment, 29 out of 29 positive samples were found to be positive. Detection of various respiratory pathogens in clinical sample libraries was significantly improved by enrichment, even when working with new technologies or instruments.
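Expressed as detection rates, the figures quoted above work out as follows. This is a simple arithmetic sketch in Python; the function name is hypothetical, and the counts are the ones given in the case study.

```python
def detection_rate(detected, total):
    """Percentage of known-positive samples that the method also detected."""
    return 100.0 * detected / total


shotgun_rate = detection_rate(17, 29)   # shotgun metagenomics without enrichment
enriched_rate = detection_rate(29, 29)  # after hybrid-capture enrichment

print(f"Shotgun: {shotgun_rate:.1f}%")    # Shotgun: 58.6%
print(f"Enriched: {enriched_rate:.1f}%")  # Enriched: 100.0%
```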
Next generation sequencing can ultimately be considered as the first diagnostic one-stop-shop in personalized clinical microbiology. It can be used for tracking outbreaks and identifying sources of recurrent infections.
It can also predict resistance or virulence phenotypes from genome sequences, which enables optimal therapy. Next generation sequencing can also detect mutations in the microbial genome, potentially explaining therapy failures or vaccine-escaping mutants. It also offers an unbiased and culture-free way to identify pathogens.
Enrichment increases sensitivity while reducing cost because users do not have to sequence as deep. In the future, these kinds of technologies will also be able to be used to understand pathogen-microbiome/virome interactions.
About Professor Rossen
John Rossen has a 30-year history in molecular virology and microbiology and is Professor for Medical Microbiology, in particular Personalised Microbiology, at the University of Groningen.
His Personalised Microbiology group has implemented NGS into clinical microbiology and infection prevention, and is now focusing on implementing metagenomics and metatranscriptomics.
These methods are applied to patient, animal, food and water samples, characterizing micro-organisms (including viruses) and the interaction between them, as well as with their host. He is immediate past president of the ESCMID study group for genomic and molecular diagnostics (ESGMD), and board member of the Dutch Society of Medical Microbiology.
About Tecan
Tecan is a leading global provider of automated laboratory instruments and solutions. Their systems and components help people working in clinical diagnostics, basic and translational research and drug discovery bring their science to life.
In particular, they develop, produce, market and support automated workflow solutions that empower laboratories to achieve more. Their Cavro branded instrument components are chosen by leading instrumentation suppliers across multiple disciplines.
They work side by side with a range of clients, including diagnostic laboratories, pharmaceutical and biotechnology companies as well as university research centers. Their expertise extends to developing and manufacturing OEM instruments and components, marketed by their partner companies. Whatever the project – large or small, simple or complex – helping their clients to achieve their goals comes first.
They hold a leading position in all the sectors they work in and have changed the way things are done in research and development labs around the world. In diagnostics, for instance, they have raised the bar when it comes to the reproducibility and throughput of testing.
In under four decades, Tecan has grown from a Swiss family business to a brand that is well established on the global stage of life sciences. From its pioneering days on a farm, the business has grown into the leading role it assumes today, empowering research, diagnostics and many applied markets around the world.