By Sally Roberts, BSc (Hons)
Also known as high-throughput sequencing, next-generation sequencing (NGS) is the term used to describe several modern sequencing technologies that enable scientists to sequence DNA and RNA much faster and more cheaply than Sanger sequencing, the technique previously used. NGS has revolutionized the study of molecular biology and genomics.
Nucleic acid sequencing is a technique used to determine the exact sequence of nucleotides within a given molecule of DNA or RNA. First-generation sequencing, or Sanger sequencing, was the technique used to complete the Human Genome Project, which finished in 2003.
Once the first human genome sequence had been elucidated, researchers started to demand techniques that would enable them to complete sequencing more rapidly and at lower cost. This led to the development of NGS platforms, which perform massively parallel sequencing and enable millions of DNA fragments to be sequenced simultaneously. Using this technology, researchers can achieve a throughput high enough to sequence a whole genome in just a single day.
NGS platforms achieve sequencing in different ways, but several of the most commonly used platforms are based on a similar methodology, which involves template preparation, followed by sequencing and imaging and, finally, data analysis.
Template preparation
Template preparation refers to the building and amplification of a nucleic acid library, which may be made up of DNA or complementary DNA (cDNA). To construct the sequencing library, the nucleic acid sample is fragmented and the ends of the DNA fragments are ligated to chemically synthesized DNA molecules (adapters) whose nucleotide sequence is already known. Once a library has been built, it needs to be amplified before sequencing can be performed.
Sequencing and imaging
Fragments in the library provide a template for the creation of a new DNA fragment. The fragments are washed and flooded sequentially with the known nucleotides. As the nucleotides become incorporated into the new, growing strand of DNA, they are recorded as sequence information.
Data analysis
Following sequencing, the new sequence data needs to be analyzed. The data needs to be pre-processed in order to remove poor-quality reads and adapter sequences. The data is then mapped to a reference genome and the sequence is analyzed.
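The pre-processing step can be illustrated with a short Python sketch. It is a simplified illustration rather than a real pipeline: the adapter sequence, the quality threshold, the minimum length and the three example reads are all hypothetical, and quality strings are assumed to use the common Phred+33 encoding. Dedicated trimming tools also handle partial and mismatched adapters, which this sketch does not.

```python
from statistics import mean

# Hypothetical adapter sequence, chosen only for illustration.
ADAPTER = "AGATCGGAAG"


def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_adapter(seq, qual, adapter=ADAPTER):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def preprocess(reads, min_mean_quality=20, min_length=5):
    """Trim adapters, then drop reads that are too short or of low quality."""
    kept = []
    for name, seq, qual in reads:
        seq, qual = trim_adapter(seq, qual)
        if len(seq) < min_length:
            continue
        if mean(phred_scores(qual)) < min_mean_quality:
            continue
        kept.append((name, seq, qual))
    return kept


# Three made-up reads as (name, sequence, quality string) tuples.
reads = [
    ("read1", "ACGTACGTAC" + ADAPTER + "TT", "I" * 22),  # contains the adapter
    ("read2", "ACGTACGTAC", "#" * 10),                   # very low quality
    ("read3", "GGCATTACGA", "I" * 10),                   # clean read
]

cleaned = preprocess(reads)
for name, seq, _ in cleaned:
    print(name, seq)
```

In this example, read1 is kept after its adapter and the bases that follow it are removed, read2 is discarded because its mean quality falls below the threshold, and read3 passes unchanged.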
The analysis may involve various bioinformatics assessments, such as genetic variant calling to identify single nucleotide polymorphisms (SNPs), the identification of novel genes, the detection of mutation events, or investigation of the degree of transcript expression.
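As a rough illustration of the idea behind SNP calling, the Python sketch below stacks reads that have already been mapped to a reference and reports positions where most reads disagree with the reference base. The function name, the thresholds and the example data are assumptions made for this illustration; real variant callers use statistical models that also account for base quality, mapping quality and diploid genotypes.

```python
from collections import Counter, defaultdict


def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Report (position, reference base, observed base) for simple SNPs.

    aligned_reads is a list of (start, sequence) pairs, where start is the
    0-based position on the reference at which the read was mapped.
    """
    # Count the bases observed at each reference position (a "pileup").
    pileup = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        base, n = counts.most_common(1)[0]
        # Call a SNP only where enough reads agree on a non-reference base.
        if depth >= min_depth and base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base))
    return snps


# Made-up reference and reads; three reads carry a T at position 5.
reference = "ACGTACGTAC"
aligned_reads = [
    (0, "ACGTA"),
    (2, "GTATGT"),
    (3, "TATGTA"),
    (4, "ATGTAC"),
]

snps = call_snps(reference, aligned_reads)
print(snps)
```

Here the only position reported is 5 (counting from zero), where the reference has C and all three covering reads show T.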
The software required to analyze sequence data is often freely available online.
NGS has a large number of applications, thereby facilitating rapid technological advances across many fields of biological science. The human genome is being re-sequenced in order to detect genetic factors that contribute to disease. NGS has enabled researchers to gain significant knowledge about a wide range of organisms using whole-genome sequencing.
The technology is also used in epidemiology and public health studies, to sequence bacteria and viruses and help detect factors that may contribute to virulence. Furthermore, in gene expression research, NGS of RNA has started to replace microarray analysis, enabling researchers to see both the sequence of RNA molecules and the level at which they are expressed. This RNA sequencing can provide information on a sample’s entire transcriptome in one analysis, without any previous knowledge of an organism’s genetic sequence being required.
These are only a few of the applications in which NGS can aid researchers and clinicians, and as the technology continues to become more widely used, more novel applications will inevitably be developed.
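One simple way to compare transcript expression between samples sequenced to different depths is to scale raw read counts to counts per million (CPM). The Python sketch below shows the arithmetic only; the gene names and counts are hypothetical, and CPM is just one of several normalization schemes used in practice.

```python
def counts_per_million(read_counts):
    """Scale raw per-gene read counts to counts per million mapped reads."""
    total = sum(read_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in read_counts}
    return {gene: count * 1_000_000 / total for gene, count in read_counts.items()}


# Hypothetical read counts for three genes in one sample.
read_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}

cpm = counts_per_million(read_counts)
for gene, value in cpm.items():
    print(gene, value)
```

Because every value is divided by the sample's total read count, the CPM values for a sample always sum to one million, which makes samples of different sequencing depth easier to compare.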