Gene expression analysis is the study of how frequently individual genes are expressed in a sample, relative to one another. Genes are encoded in DNA, transcribed to RNA, and translated to protein. Messenger RNAs, or mRNAs, are the intermediate step between the gene and the protein. The amount of a given mRNA in a sample therefore indicates how strongly the corresponding gene is expressed: the more copies of the mRNA, the higher the level of expression.
Gene expression studies have been carried out with a variety of methods, including DNA microarrays, fluorescence in situ hybridization, RNA-seq, tiling arrays, massively parallel signature sequencing (MPSS), and serial analysis of gene expression (SAGE). The last two methods, MPSS and SAGE, are similar in both technique and application. Both can be used without prior sequence information.
Massively parallel signature sequencing
In MPSS, mRNA transcripts are captured on individual microbeads through a complementary DNA signature sequence. The microbeads are analyzed in array format in a flow cell. Bases of the mRNA are systematically read by hybridization to a fluorescently labeled coder and then removed. Successive rounds of identification and removal are carried out until the sequence is complete. The result is an array of signature sequences, each 17 to 20 bp long. Because many thousands of beads are sequenced at the same time, the method is described as massively parallel.
Expression level is reported as the number of times a transcript's signature is observed per million molecules sequenced. MPSS does not require genes to be identified and characterized before the analysis begins. The method is sensitive enough to detect a few molecules of mRNA per cell.
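The per-million scaling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `transcripts_per_million` and the two 17-base signatures are made up for the example and are not taken from any MPSS software or dataset.

```python
from collections import Counter


def transcripts_per_million(signature_counts):
    """Scale raw signature counts to counts per million molecules."""
    total = sum(signature_counts.values())
    if total == 0:
        return {}
    return {
        signature: count * 1_000_000 / total
        for signature, count in signature_counts.items()
    }


# Hypothetical signatures read from four beads (made-up sequences, not real data).
reads = ["GATCAAGTCCTGAACGT"] * 3 + ["GATCTTGACCGTAAGCA"]
counts = Counter(reads)          # raw number of beads per signature
tpm = transcripts_per_million(counts)

for signature, value in tpm.items():
    print(signature, value)
```

Because every value is divided by the same total, the scaled values always sum to one million, which is what makes counts from libraries of different sizes comparable.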
Serial analysis of gene expression
SAGE uses mRNA from a particular sample to create complementary DNA (cDNA) fragments, which are amplified and sequenced using high-throughput sequencing technology. The method relies on short tags, each of which identifies the transcript it came from, and on rapid sequencing of many tags linked together. Linking the cDNA segments into a long chain simplifies sequencing, because a single read covers many tags. The resulting analysis gives a snapshot of the transcriptome of the sample, including the identity and abundance of each mRNA.
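The tag-and-chain idea can be illustrated with a small Python sketch. It is a simplification, not the laboratory protocol: the anchoring site `CATG`, the 10-base tag length, the helper names, and the cDNA strings are all assumptions chosen for the example, and the chain here is a plain end-to-end join of fixed-length tags.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumption: recognition site used to anchor each tag
TAG_LENGTH = 10       # assumption: bases kept downstream of the anchor site
UNIT = len(ANCHOR_SITE) + TAG_LENGTH


def extract_tag(cdna):
    """Return the anchor site plus the bases after it, taken at the
    3'-most anchor site, or None if no full-length tag is available."""
    pos = cdna.rfind(ANCHOR_SITE)
    if pos == -1:
        return None
    tag = cdna[pos:pos + UNIT]
    return tag if len(tag) == UNIT else None


def split_concatemer(concatemer):
    """Cut a chain of fixed-length tags back into individual tags."""
    return [concatemer[i:i + UNIT] for i in range(0, len(concatemer), UNIT)]


# Hypothetical cDNA fragments (made-up sequences); the last has no anchor site.
cdnas = [
    "GGCATGACGTACGTACTTAAAA",
    "GGCATGACGTACGTACTTAAAA",
    "AACATGTTTTGGGGCCAAA",
    "AAAAAA",
]

tags = [t for t in (extract_tag(c) for c in cdnas) if t is not None]
concatemer = "".join(tags)                        # link the tags into one chain
tag_counts = Counter(split_concatemer(concatemer))  # abundance of each tag
print(tag_counts)
```

Sequencing the chain once and cutting it back into tags yields the same counts as sequencing each fragment separately, which is the saving the method is built around.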
MPSS compared with SAGE
MPSS has a few advantages over SAGE. One is the length of the sequence used to identify transcripts. SAGE generates a tag of about 14 nucleotides, whereas MPSS uses a 17-nucleotide signature. Only about 80 percent of 14-nucleotide tags identify a single transcript, compared with about 95 percent of 17-nucleotide signatures. In addition, MPSS produces a deeper dataset, which may be more useful for mRNAs expressed at a very low level. MPSS has the capacity for a library of 1 million signature tags. That’s about 20 times the size of a SAGE library.
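A short Python sketch shows why three extra nucleotides help. It is a toy illustration only: the function names and the three sequences are invented for the example, identifiers are simplified to the leading bases of each sequence, and the output does not reproduce the 80 and 95 percent figures quoted above.

```python
from collections import Counter


def possible_sequences(length):
    """Number of distinct DNA sequences of a given length (4 bases per position)."""
    return 4 ** length


def fraction_unique(transcripts, length):
    """Fraction of transcripts whose first `length` bases match no other transcript."""
    prefixes = Counter(t[:length] for t in transcripts)
    unique = sum(1 for t in transcripts if prefixes[t[:length]] == 1)
    return unique / len(transcripts)


# Hypothetical transcripts (made-up sequences): the first two share their
# first 14 bases and differ only in bases 15 to 17.
transcripts = [
    "GATCAAGTCCTGAACGT",
    "GATCAAGTCCTGAATTA",
    "GATCTTGACCGTAAGCA",
]

print(possible_sequences(14))   # 268435456
print(possible_sequences(17))   # 17179869184
print(fraction_unique(transcripts, 14))
print(fraction_unique(transcripts, 17))
```

Going from 14 to 17 nucleotides multiplies the number of possible identifiers by 64, so two different transcripts are much less likely to share one. In the toy set, the two transcripts that collide at 14 bases are told apart at 17.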
Both techniques report gene expression as transcript counts per million molecules, and both have very high throughput.
Applications for SAGE and MPSS are quite similar. Gene expression analysis has been used across a wide range of organisms and disease states. Cancer-specific expression of genes has been used to identify disease markers and potential targets for therapy. Tissue-specific expression can reveal the underlying biology of the cells.