Introns and exons are nucleotide sequences within a gene. Introns are removed by RNA splicing as RNA matures, meaning that they are not expressed in the final messenger RNA (mRNA) product, while exons go on to be covalently bonded to one another in order to create mature mRNA.
Introns can be considered as intervening sequences, and exons as expressed sequences.
On average, a human gene contains 8.8 exons and 7.8 introns.
What are Exons?
Exons are nucleotide sequences in DNA and RNA that are retained in mature RNA. The process by which DNA is used as a template to create mRNA is called transcription.
mRNA then works in conjunction with ribosomes and transfer RNA (tRNA), both present in the cytoplasm, to create proteins in a process known as translation.
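As a rough illustration of transcription, the sketch below builds an RNA sequence by pairing each base of a DNA template strand with its complement. The template sequence and the `transcribe` helper are hypothetical and chosen only for this example. In a real cell the first product is pre-mRNA, which still contains introns.

```python
# Base-pairing rules for RNA synthesis from a DNA template strand.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_dna)


# Hypothetical template strand, written 3'->5'.
template = "TACCGGAAAATT"
mrna = transcribe(template)
print(mrna)
```

The helper assumes the input contains only the letters A, T, G and C. Any other character raises a `KeyError`.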
Exons usually include both the 5’ and 3’ untranslated regions (UTRs) of mRNA, in addition to the protein-coding sequence. The coding sequence lies between the UTRs, beginning at the start codon and ending at the stop codon.
What are Introns?
Introns are nucleotide sequences in DNA and RNA that do not directly code for proteins, and are removed during the precursor messenger RNA (pre-mRNA) stage of maturation of mRNA by RNA splicing.
Introns can range in size from tens of base pairs to many thousands of base pairs, and are found in a wide variety of RNA-producing genes in most living organisms, as well as in some viruses.
Four distinct types of introns have been identified:
- Introns in protein coding genes, removed by spliceosomes
- Introns in tRNA genes, which are removed by protein enzymes
- Self-splicing introns, which catalyse their own removal from mRNA, tRNA, and rRNA precursors using guanosine-5'-triphosphate (GTP), or another nucleotide cofactor (Group 1)
- Self-splicing introns, which do not require GTP in order to remove themselves (Group 2)
It is vital for the introns to be removed precisely, as any left-over intron nucleotides, or deletion of exon nucleotides, may result in a faulty protein being produced. This is because the amino acids that make up proteins are joined together based on codons, which consist of three nucleotides. An imprecise intron removal thus may result in a frameshift, which means that the genetic code would be read incorrectly.
This can be explained by using the following phrase as a metaphor for an exon: “BOB THE BIG TAN CAT”. If the intron before this exon was imprecisely removed, so that the “B” was no longer present, then the sequence would become unreadable: “OBT HEB IGT ANC AT…”
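The same metaphor can be reproduced in a few lines of code. This is a minimal sketch: the `read_codons` helper is a name made up for this example, and it simply reads the letters three at a time, as a ribosome reads codons.

```python
def read_codons(sequence: str, size: int = 3) -> list[str]:
    """Split a sequence into consecutive groups of `size` letters."""
    letters = sequence.replace(" ", "")
    return [letters[i:i + size] for i in range(0, len(letters), size)]


phrase = "BOB THE BIG TAN CAT"

# Correct reading frame: every "codon" is a readable word.
in_frame = read_codons(phrase)

# Imprecise splicing removes the first letter and shifts the frame.
shifted = read_codons(phrase[1:])

print(" ".join(in_frame))   # BOB THE BIG TAN CAT
print(" ".join(shifted))    # OBT HEB IGT ANC AT
```

Losing a single letter changes every group that follows it, which is why a frameshift usually alters the whole downstream protein sequence rather than one amino acid.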
RNA splicing is the method by which pre-mRNA is made into mature mRNA, by removal of introns and joining together of exons. Several methods of splicing exist, depending on the organism, type of RNA or intron structure, and the presence of catalysts.
Introns possess a highly conserved GU sequence at their 5’ end, known as the donor site, and a highly conserved AG sequence at their 3’ end, called the acceptor site. A large RNA-protein complex, the spliceosome, is made up of five small nuclear ribonucleoproteins (snRNPs). It recognises the start and end points of the intron by means of these sites, and catalyses the removal of the intron accordingly. The spliceosome forms the intron into a loop (a lariat) that is then cleaved away, and the exons on each side of the intron are joined. A second kind of spliceosome, known as the minor spliceosome, also exists and recognises a rare class of introns with unusual splice-site sequences.
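The GU and AG boundary rule can be sketched as a simple check performed before an intron is cut out. The example below is a toy model under stated assumptions: the `splice` function, the sequences, and the intron coordinates are all hypothetical, and the positions are supplied by hand. A real spliceosome also relies on longer consensus sequences and a branch point, which are not modelled here.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) half-open index pairs.

    Each intron must begin with GU (donor site) and end with AG
    (acceptor site); otherwise a ValueError is raised.
    """
    mature = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        mature.append(pre_mrna[position:start])  # exon before this intron
        position = end
    mature.append(pre_mrna[position:])  # final exon
    return "".join(mature)


# Hypothetical toy transcript: exon 1 + intron + exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCCUUAG", "UUUUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mature)
```

Raising an error when the boundaries are wrong mirrors the point made earlier: cutting at the wrong position would shift the reading frame of everything downstream.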
tRNA splicing is far rarer, although it does occur in all three major domains of life: bacteria, archaea and eukarya. In place of snRNPs, multiple enzymes carry out the reaction in a step-wise process, which can vary widely between organisms.
Self-splicing introns are ribozymes, that is, RNA molecules that catalyse biochemical reactions. Group 1 introns are attacked at the 5’ splice site by a nucleotide cofactor, which may be free in the biological milieu or a part of the intron itself. This frees the 3’ OH of the upstream exon, which then acts as a nucleophile and bonds to the 5’ end of the downstream exon, releasing the intron. Group 2 introns are spliced in a similar way, except that a specific adenosine within the intron attacks the 5’ splice site.
Alternative splicing refers to the way that different combinations of exons can be joined together, resulting in a single gene coding for multiple proteins. Walter Gilbert first put this idea forward, proposing that different combinations of exons could produce different protein isoforms, which in turn would have different chemical and biological activities.
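To give a sense of how quickly the number of possible isoforms grows, the sketch below lists the exon combinations for a hypothetical four-exon gene. It assumes a simplified model in which the first and last exons are always kept, any internal exon may be skipped, and exon order is preserved. The `isoforms` function and the exon names are made up for this example, and real genes use further splicing patterns that are not modelled here.

```python
from itertools import combinations


def isoforms(exons: list[str]) -> list[list[str]]:
    """List exon combinations that keep the first and last exon, in order."""
    first, *internal, last = exons  # requires at least two exons
    results = []
    for k in range(len(internal) + 1):
        for chosen in combinations(internal, k):
            results.append([first, *chosen, last])
    return results


# Hypothetical gene with four exons.
exons = ["E1", "E2", "E3", "E4"]
variants = isoforms(exons)
for variant in variants:
    print("-".join(variant))
```

Under this model a gene with n internal exons has 2 to the power n possible isoforms, so even a modest number of optional exons allows many distinct proteins.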
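Earlier estimates suggested that between 30 and 60% of human genes undergo alternative splicing, and more recent transcriptome sequencing studies put the figure above 90% for genes with multiple exons. Moreover, it has been estimated that over 60% of disease-causing mutations in humans are related to splicing errors, rather than mistakes in coding sequences.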
One example of a human gene that undergoes alternative splicing is the gene for fibronectin, a glycoprotein that extends from the cell into the extracellular matrix. Over 20 different isoforms of fibronectin have been discovered, all produced from different combinations of exons from the fibronectin gene.