Shotgun sequencing is performed by randomly fragmenting DNA into small pieces, sequencing those pieces, and then reassembling them computationally by finding overlapping ends. This technique is used for genomic, transcriptomic, and proteomic sequencing.
What is sequence analysis?
Sequence analysis is the computational analysis of DNA, RNA, or protein sequences to determine the biological properties, structure, function, and evolution of the target sequence.
Next-generation sequencing (NGS) is a high-throughput DNA sequencing technology that has made it possible to sequence the entire human genome within a short period. In addition to sequencing the entire genome (whole-genome sequencing), NGS can be used to sequence many small DNA fragments in parallel; bioinformatic approaches are then used to align these fragments to the human reference genome.
Many studies now sequence only a subset of the genome rather than the whole genome, which is time-consuming and expensive. Such subset analysis is called targeted sequencing: multiple genetic regions of interest are isolated or enriched from whole-genome preparations and then subjected to NGS.
What is shotgun sequencing?
Shotgun sequencing is an efficient technique for sequencing large pieces of DNA, which are randomly fragmented into many smaller pieces. These small fragments are then sequenced individually, and the resulting sequence data (reads) are analyzed using computer programs that look for reads whose ends share identical sequences.
Reads with matching ends are then overlapped and merged computationally into longer contiguous sequences. This process is repeated until the entire sequence of the starting DNA piece is obtained.
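The overlap-and-merge idea can be sketched in a few lines of Python. This is a minimal, illustrative greedy assembler, not the method used by any particular assembly program; the function names (overlap_length, greedy_assemble) and the min_len threshold are assumptions made for this example, and the reads are assumed to be error-free and all from the same strand.

```python
def overlap_length(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences that share the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        # Discard sequences that are fully contained in another sequence.
        contigs = [c for c in contigs
                   if not any(c != o and c in o for o in contigs)]
        best = (0, -1, -1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap_length(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:  # nothing left to merge
            return contigs
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)


# Three toy reads drawn from the sequence ATGCGTACGTTAGC
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Real assemblers must also cope with sequencing errors, reads from both strands, and repeats, which this sketch ignores.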
Shotgun sequencing is particularly effective for sequencing the genomes of multicellular organisms, which are more difficult to clone because of their large size and structural complexity. Compared with clone-based sequencing, shotgun sequencing is much faster and less expensive.
In many cases, shotgun sequencing is used to remove errors, make corrections, and improve the accuracy of existing clone-based sequences, including the reference human genome.
The shotgun sequencing approach was applied in the official Human Genome Project, in which human DNA was first cloned into yeast artificial chromosomes and bacterial artificial chromosomes; the clones were then mapped to their chromosomal locations and shotgun sequenced.
What is shotgun transcriptome sequencing?
Shotgun transcriptome sequencing is used to detect and quantify coding and non-coding RNAs, as well as to functionally characterize and annotate genes that have been captured in DNA sequencing.
Whole transcriptome analysis by shotgun sequencing, also known as RNA sequencing, can also be used to form gene-to-gene interaction networks to understand the functionality of various biological systems.
Put simply, whole transcriptome shotgun sequencing helps make a blueprint of the transcriptome, the entire population of cellular RNAs (such as mRNA, tRNA, and rRNA). This makes it possible to determine the level of gene expression and the status and timing of gene activation patterns.
Such sequencing techniques are commonly used to detect single nucleotide polymorphisms, RNA editing, alternative splicing events, transcriptional networks, differential gene expression, and post-transcriptional modifications (polyadenylation and 5’ capping).
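As a rough illustration of how a gene expression level can be estimated from sequencing data, the sketch below converts per-gene read counts into transcripts per million (TPM), a commonly used normalization that corrects for gene length and sequencing depth. The article does not name a specific measure, so the choice of TPM, the function name, and the toy numbers are all assumptions made for this example.

```python
def tpm(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """Convert raw read counts per gene into transcripts per million (TPM).

    counts:     number of reads assigned to each gene (hypothetical values)
    lengths_bp: length of each gene's transcript in base pairs
    """
    # Reads per base: longer genes collect more reads at equal expression.
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Scale so that all values in the sample sum to one million.
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}


counts = {"geneA": 100, "geneB": 100}
lengths_bp = {"geneA": 1000, "geneB": 2000}
for gene, value in tpm(counts, lengths_bp).items():
    print(gene, round(value))
```

Both toy genes receive the same number of reads, but geneA is half as long, so its estimated expression is twice that of geneB.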
How is shotgun transcriptome sequencing performed?
First, single-stranded RNAs (mRNAs) are used to synthesize complementary DNA (cDNA) fragments, which together form a cDNA library, and the functional elements necessary for sequencing (adapters) are added to each end of the cDNA fragments. The resulting cDNA library is then subjected to shotgun sequencing, which produces short sequences corresponding to the ends of the fragments. Either single-read or paired-end sequencing can be used for cDNA library sequencing.
In single-read sequencing, the cDNA is sequenced from only one end, which makes the technique cheaper and faster to perform. In contrast, the paired-end technique sequences the cDNA from both ends, which makes it more expensive and time-consuming.
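The difference between the two read types can be illustrated with a toy fragment. The sketch below assumes an idealized, error-free sequencer and the common convention that the second read of a pair is read from the opposite strand, starting at the other end of the fragment; the function names and the read length are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def single_read(fragment: str, read_len: int) -> str:
    """Read `read_len` bases from one end of the fragment only."""
    return fragment[:read_len]


def paired_end_reads(fragment: str, read_len: int) -> tuple[str, str]:
    """Read `read_len` bases from each end of the fragment.

    The second read comes from the opposite strand, so it is the start
    of the fragment's reverse complement.
    """
    forward = fragment[:read_len]
    reverse = reverse_complement(fragment)[:read_len]
    return forward, reverse


fragment = "ATGCCGTAAGCTTGA"
print(single_read(fragment, 5))       # ATGCC
print(paired_end_reads(fragment, 5))  # ('ATGCC', 'TCAAG')
```

Because both reads of a pair come from the same fragment, their known relationship gives downstream software an extra constraint when placing them.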
In addition, two types of procedures can be applied for sequencing: strand-specific and non-strand-specific procedures.
In the strand-specific procedure, information about which DNA strand was transcribed is retained, whereas the non-strand-specific procedure does not indicate which DNA strand corresponds to the original mRNA.
The transcript data (reads) obtained from the sequencing are aligned to the reference genome and analyzed using different software packages.
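To make the alignment step concrete, the sketch below places a read on a reference sequence by exact matching and reports which strand it matches. It is only a toy: real alignment software uses indexed searches and tolerates mismatches and gaps, and the function name align_read and the example sequences are assumptions made for this illustration.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def align_read(read: str, reference: str) -> Optional[tuple[int, str]]:
    """Find an exact match of `read` in `reference`.

    Returns (0-based start position, strand), where strand is "+" if the
    read matches the reference as given and "-" if its reverse complement
    matches. Returns None if neither matches.
    """
    pos = reference.find(read)
    if pos != -1:
        return pos, "+"
    pos = reference.find(reverse_complement(read))
    if pos != -1:
        return pos, "-"
    return None


reference = "ATGCCGTAAGCTTGA"
print(align_read("CGTAA", reference))  # (4, '+')
print(align_read("TCAAG", reference))  # (10, '-')
print(align_read("GGGGG", reference))  # None
```

With a non-strand-specific library, the "+" or "-" result says only where the read lies, not which strand was transcribed; a strand-specific protocol is what allows the strand of the original RNA to be inferred.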
What are the disadvantages of shotgun sequencing?
Although the time-consuming clone-by-clone steps of conventional sequencing methods can be avoided, shotgun sequencing requires substantial computational power and sophisticated software packages to align and analyze the resulting sequences.
Since no genetic map is used to assemble sequences, the chance of error is relatively high in shotgun sequencing. However, these errors can often be resolved by using a reference genome. A reference genome is particularly needed for whole-genome shotgun sequencing; otherwise, sequence alignment becomes very difficult.
Sequences that are present in multiple copies in the genome are difficult to assemble in shotgun sequencing.
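A small experiment shows why repeats cause trouble. In the sketch below, two different toy genomes contain the same three-base repeat (GCG) three times, with the segments between the repeats swapped. When every read is only four bases long, both genomes produce exactly the same set of reads, so no assembler could tell them apart; longer reads that span a repeat and its flanks remove the ambiguity. The genomes and the helper name all_reads are made up for this illustration.

```python
def all_reads(genome: str, read_len: int) -> list[str]:
    """Return every substring of length `read_len`, sorted.

    This models idealized, error-free shotgun reads covering every position.
    """
    return sorted(genome[i:i + read_len]
                  for i in range(len(genome) - read_len + 1))


repeat = "GCG"
# Same building blocks; the segments between the repeats are swapped.
genome_1 = "AA" + repeat + "TA" + repeat + "AT" + repeat + "TT"
genome_2 = "AA" + repeat + "AT" + repeat + "TA" + repeat + "TT"

print(genome_1 != genome_2)                            # True
print(all_reads(genome_1, 4) == all_reads(genome_2, 4))  # True: ambiguous
print(all_reads(genome_1, 6) == all_reads(genome_2, 6))  # False: resolved
```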