Shotgun sequencing is the method Craig Venter's team at Celera Genomics used to sequence the human genome. One of the earliest methods of DNA sequencing, the chain termination method or Sanger sequencing, can read a DNA chain of only about 1,000 base pairs at a time. Shotgun sequencing works around this limit by breaking a long DNA molecule into many short, overlapping fragments that are sequenced individually and then reassembled, which increases the total amount of DNA that can be sequenced. It is more of a strategy than a distinct method.
A shotgun approach was first used for early sequencing of small genomes like cauliflower mosaic virus. Later, shotgun methods were adapted (with the development of powerful computer algorithms) for sequencing and reassembling large genomes, most notably the human genome.
The Sanger Method
Base pairs are the building blocks of DNA. The four bases are adenine, cytosine, guanine, and thymine (abbreviated as A, C, G, and T, respectively). Adenine pairs with thymine, and cytosine pairs with guanine, so a single-stranded chain of AGTTAC would pair with a complementary chain of TCAATG. Strands of these paired chains form the familiar double helix pattern of DNA.
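The pairing rule can be written out as a small lookup table. The following is a minimal Python sketch; the function names are illustrative and not taken from any particular library. It also shows the reverse complement, which is how the partner strand is conventionally written, since the two strands of the helix run in opposite directions.

```python
# Pairing rules: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the pairing base for each position, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5-prime to 3-prime direction."""
    return complement(strand)[::-1]

print(complement("AGTTAC"))          # TCAATG
print(reverse_complement("AGTTAC"))  # GTAACT
```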
The Sanger method of sequencing is sufficient for reading the sequences of short chains, but is inadequate for longer sequences. The human genome has about three billion nucleotides, and many other genomes and sequences that are of interest to science are also too large for Sanger sequencing.
Shotgun Fragmentation of Sequences
In the late 1990s, Craig Venter adapted the shotgun approach to large genomes. In that method, the DNA is randomly broken into many small pieces, cloned into a bacterial host, and sequenced using chain termination. Multiple rounds of fragmentation and sequencing are carried out, creating overlapping sequences. Powerful computer algorithms are then used to reassemble the sequence.
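The fragment-and-reassemble idea can be illustrated with a toy example. The sketch below is not the algorithm used by Celera or any production assembler; it is a simple greedy overlap merge, written under the assumptions that reads are error-free, all come from the same strand, and the sequence contains no repeats. The 16-base `genome` string and all function names are made up for illustration.

```python
import random

def shotgun_reads(genome, read_len, n_reads, seed=0):
    """Sample fixed-length reads from random start positions in the genome."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    return [genome[s:s + read_len]
            for s in (rng.randint(0, last_start) for _ in range(n_reads))]

def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def drop_contained(seqs):
    """Remove duplicates and any sequence that lies wholly inside another."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != t and s in t for t in unique)]

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two sequences with the largest suffix-prefix overlap."""
    contigs = drop_contained(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left overlaps: the remaining pieces stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])
    return contigs

genome = "ATGCGTACCTGAAGTC"  # hypothetical sequence with no repeated 3-base word
reads = ["CTGAAGTC", "ATGCGTAC", "GTACCTGA"]  # overlapping reads, out of order
print(greedy_assemble(reads))  # ['ATGCGTACCTGAAGTC']

# Randomly sampled reads may leave gaps, giving several pieces instead of one.
print(greedy_assemble(shotgun_reads(genome, read_len=8, n_reads=20)))
```

Comparing every pair of reads is what makes this approach computationally expensive: the work grows roughly with the square of the number of reads, and real assemblers rely on indexing and graph methods to avoid it.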
Venter first demonstrated his whole-genome shotgun method on the bacterial species Haemophilus influenzae at The Institute for Genomic Research (TIGR) in the US, which he founded after leaving the National Institutes of Health (NIH). The project took four months, compared with the thirteen years researchers spent sequencing E. coli using earlier, map-first approaches, and ten years for yeast.
The alternative at the time was to create a low-resolution map of the genome first, and then calculate the minimum number of fragments needed to cover the entire genome. The genome was broken up randomly into fragments and the fragments cloned into bacterial hosts. Based on the map, a set of cloned fragments was chosen to form a scaffold, or tiling path, that theoretically covered the entire sequence, and those fragments were sequenced.
Shotgun sequencing was a more direct alternative, but required a great deal more computing power, pushing the limits of processors available at the time.
Sanger sequencing originally read each fragment from one end only, with the new strand extended from the 5-prime end toward the 3-prime end. However, sequencing each fragment from both ends yields a pair of reads a known approximate distance apart, which extends the stretch of DNA each fragment accounts for and can correct for certain kinds of errors in the process. Shotgun sequencing usually makes use of this paired-end strategy.
Advantages and Disadvantages of Shotgun Sequencing
Shotgun sequencing had a number of important advantages over previous methods:
- Faster because the mapping process was eliminated
- Uses less DNA than other methods
- Less expensive than approaches requiring a map
Some disadvantages of shotgun sequencing include:
- Requires computer processing power beyond what an ordinary laboratory would possess
- Can introduce errors in the assembly process
- Benefits from a reference genome or map to guide and check the assembly
- May not be able to assemble repetitive sequences
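The difficulty with repetitive sequences can be shown with a small, contrived example. In the Python sketch below, two different made-up sequences share a 7-base repeat; every name and sequence is hypothetical. When reads are shorter than the repeat, both sequences produce exactly the same set of reads, so no assembler could tell which arrangement is correct from those reads alone.

```python
def all_reads(genome, read_len):
    """Every substring of the given length: an idealised, error-free, complete read set."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}

REPEAT = "GATTACA"  # hypothetical repeat that occurs three times

# The same pieces in two different orders: the segments TG and AC are swapped.
genome_1 = "CC" + REPEAT + "TG" + REPEAT + "AC" + REPEAT + "GG"
genome_2 = "CC" + REPEAT + "AC" + REPEAT + "TG" + REPEAT + "GG"

print(genome_1 != genome_2)                              # True
print(all_reads(genome_1, 6) == all_reads(genome_2, 6))  # True: reads cannot tell them apart
print(all_reads(genome_1, 9) == all_reads(genome_2, 9))  # False: reads now span the whole repeat
```

Reads long enough to cross a repeat, or paired reads that bracket it, are what allow the ambiguity to be resolved.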