The genetic blueprint for all life forms is in the form of nucleic acid, the most common being deoxyribonucleic acid (DNA). This chemical carries within its structure the ability to encode all the thousands of proteins and other structural and functional elements required to build the organism’s body as well as to operate every life process.
However, the coding regions, or genes, that are responsible for the actual production of proteins make up only about 1.5% of human DNA. The rest is non-coding DNA, sometimes referred to as junk DNA.
Yet parts of this so-called junk DNA are now known to have essential functions, such as regulating gene expression by turning coding sequences on or off. Other portions modulate how strongly a gene is expressed. Thus, far from being junk, much of it is better called functional DNA, though many of its functions are still being discovered.
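To put the 1.5% figure in perspective, here is a rough back-of-the-envelope calculation. It assumes a human genome of roughly 3.1 billion base pairs, a commonly quoted approximation that is not given in the text above, so the result is an order-of-magnitude estimate only.

```python
# Rough estimate only. GENOME_SIZE_BP is an assumed approximate figure
# for one copy of the human genome; it does not come from the article.
GENOME_SIZE_BP = 3_100_000_000

# 1.5% written as 15 per thousand so the arithmetic stays in integers.
CODING_PER_THOUSAND = 15

coding_bp = GENOME_SIZE_BP * CODING_PER_THOUSAND // 1000
noncoding_bp = GENOME_SIZE_BP - coding_bp

print(f"coding:     {coding_bp:,} bp")
print(f"non-coding: {noncoding_bp:,} bp")
```

Under these assumptions, the protein-coding portion is on the order of tens of millions of base pairs, against roughly three billion base pairs of non-coding sequence.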
There are several types of non-coding or junk DNA. Some of these are described below.
Non-coding RNA genes
Some non-coding DNA is transcribed into a chemically related molecule called RNA that is never translated into protein. These non-coding RNAs include transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which take part in translation, the process by which the cell reads messenger RNA to build the final protein product. They are not proteins themselves and do not directly give rise to proteins, unlike the messenger RNA transcribed from protein-encoding gene sequences. The DNA sequences that encode these RNA molecules are therefore clearly not junk.
Other examples include Piwi-interacting RNA and microRNA. MicroRNAs are thought to regulate the translation of almost one-third of all protein-coding genes in mammals. They are being investigated for their possibly crucial roles in the progression of certain diseases, such as cancer and heart disease, as well as in the immune response to infectious organisms entering the body.
Another class of specialized RNA is long non-coding RNA, which has multiple roles in gene regulation, including chromatin remodeling, transcriptional and post-transcriptional regulation, and serving as a source of small interfering RNAs (siRNAs).
Regulatory elements and introns
Non-coding DNA is also found in the form of cis- and trans-regulatory elements that modulate gene transcription. Cis-regulatory elements lie on the same DNA molecule as the gene they control, whereas trans-regulatory elements act on genes located elsewhere, including on other chromosomes. Regulatory elements may be found within introns or in the untranslated regions at the 5' or 3' ends of a gene.
An intron is a stretch of non-coding DNA incorporated into the gene sequence itself. Introns are therefore non-coding DNA by definition. They are transcribed into the preliminary messenger RNA molecule but are then removed to give rise to the mature form. They may play regulatory roles in controlling the activity of tRNA and rRNA as well as of the protein-encoding segments, or exons. However, most intronic sequence has no known function.
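The removal of introns from the preliminary messenger RNA can be sketched as simple string handling. This is a toy illustration only: the sequences and segment coordinates are hypothetical, the function names `transcribe` and `splice` are invented for this example, and real splicing is carried out by cellular machinery that recognizes signals at the intron boundaries.

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA copy of a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna: str, kept_segments: list[tuple[int, int]]) -> str:
    """Join the kept segments (0-based, end-exclusive) and drop the rest."""
    return "".join(pre_mrna[start:end] for start, end in kept_segments)


# Hypothetical toy gene: a protein-encoding segment, an intron, and a
# second protein-encoding segment.
segment1 = "ATGGCC"
intron = "GTAAGTTTTCAG"
segment2 = "TTTTAA"
gene = segment1 + intron + segment2

# The whole gene, intron included, is transcribed first.
pre_mrna = transcribe(gene)

# Only the two protein-encoding segments survive in the mature mRNA.
kept = [(0, len(segment1)), (len(segment1) + len(intron), len(gene))]
mature_mrna = splice(pre_mrna, kept)

print(pre_mrna)     # preliminary transcript, intron still present
print(mature_mrna)  # mature transcript, intron removed
```

The intron is present in the DNA and in the preliminary transcript, but it contributes nothing to the mature messenger RNA.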
All genes have a regulatory site called a promoter sequence which is a non-coding DNA segment that is bound by proteins involved in the process of transcription. Such promoter sequences do not give rise to any part of the final protein, but facilitate the transcription of a particular gene and are usually found upstream of the coding region.
Enhancer sequences also influence the likelihood that a gene will be transcribed. Proteins that activate transcription bind to these short sequences. On the other hand, inhibitory sequences (silencers) may also be present which are open to binding by inhibitory proteins that repress or reduce the chances of transcription. Silencer sequences are found a little distance away from the gene they regulate, either before or after it.
Super-enhancers are clusters of enhancer sequences bound together by physical or functional association, and which are tied to the regulation of genes that are vital for the cell’s identity, such as the transcription factors that determine the type and lineage of the cell.
Both enhancers and silencers may be present in genes that require a high degree of regulation.
Insulator sequences also bind regulatory proteins, and they act in one of two ways. Enhancer blockers prevent an enhancer from acting on genes outside its intended set, thus restricting the number of genes it can influence. Barrier insulators inhibit structural changes in the DNA that could repress the activity of the gene concerned.
Another type of non-coding DNA is the pseudogene, a DNA sequence that resembles an existing gene but does not produce a functional protein. Pseudogenes are thought to result from mutations in functional genes that prevent them from forming functional proteins or that inhibit their transcription. They can also arise through retrotransposition. Most appear to have no function.
Some viral infections may also leave behind non-coding DNA as a result of reverse transcription. This is what happens when an RNA-carrying virus such as HIV infects a cell. The virus copies its RNA into DNA, which is inserted into the host DNA, so that the host cell carries out the operations the virus needs to replicate and proliferate. These virally derived DNA sequences may later undergo mutations that inactivate them, forming pseudogenes.
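Reverse transcription and insertion into the host DNA can be sketched at the level of plain strings. This is a deliberately simplified illustration: the sequences are hypothetical, the function names `reverse_transcribe` and `integrate` are invented for this example, and the sketch ignores the second DNA strand and the enzymes involved.

```python
# Base-pairing rules for copying RNA into complementary DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


def integrate(host_dna: str, insert: str, position: int) -> str:
    """Return the host sequence with `insert` placed at `position`."""
    return host_dna[:position] + insert + host_dna[position:]


viral_rna = "AUGCGU"   # hypothetical viral sequence
host_dna = "GGGGCCCC"  # hypothetical host sequence

cdna = reverse_transcribe(viral_rna)
infected = integrate(host_dna, cdna, position=4)

print(cdna)
print(infected)
print(len(infected) - len(host_dna), "extra bases in the host sequence")
```

The host sequence grows by the length of the inserted copy, which is how virally derived sequences come to accumulate in a genome.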
Another specialized type of non-coding DNA is the transposon, a mobile genetic element that can change its location in the genome. By shifting its location, it may correct a mutation or induce one. When a transposon copies itself to a new site rather than simply moving, it also increases the size of the cell’s genome. Transposable elements make up the major part of non-coding DNA and include LINEs and SINEs. Other repetitive non-coding sequences, such as satellite DNA and VNTRs, are tandem repeats rather than mobile elements.
LINEs, or Long INterspersed Elements, are moderately repetitive, non-coding regions possibly derived from viruses. SINEs, or Short INterspersed Elements, are highly repetitive, non-functional regions that may be the result of reverse transcription of RNA.
Satellite DNA and telomeres
Telomeres are specialized segments of repeating nucleotide sequences found at the ends of chromosomes. They are important in preserving the chromosome’s structural integrity during DNA replication, because they keep the ends from being degraded.
Satellite DNA is a term used for tandemly repetitive DNA regions clustered in one area. This type of non-coding DNA is found in centromeres, the vital structures that hold the two copies of a replicated chromosome together during cell division. It is also present in heterochromatin, a densely packed form of DNA that regulates gene activity and preserves chromosome structure. VNTRs, or Variable Number Tandem Repeats, are also tandemly repeated elements, but they are shorter than satellite DNA.
In short, much more study is needed to find out what the different types of non-coding DNA do and how they do it.
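Tandem repeats of the kind described above are easy to detect computationally, because the same short unit recurs back to back. The sketch below counts the longest uninterrupted run of a given unit. The function name `longest_tandem_run` and all sequences are hypothetical. The unit `TTAGGG` is the repeat found in human telomeres, a detail not stated in the text above, and `GATA` is simply an illustrative VNTR-like unit.

```python
import re


def longest_tandem_run(sequence: str, unit: str) -> tuple[int, int]:
    """Return (start_index, copy_count) for the longest uninterrupted run
    of `unit` in `sequence`, or (-1, 0) if the unit never occurs."""
    best_start, best_copies = -1, 0
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    for match in pattern.finditer(sequence):
        copies = len(match.group()) // len(unit)
        if copies > best_copies:
            best_start, best_copies = match.start(), copies
    return best_start, best_copies


# Two hypothetical alleles that differ only in how many times the unit repeats.
allele_a = "ACGT" + "GATA" * 5 + "TTCA"
allele_b = "ACGT" + "GATA" * 8 + "TTCA"

# A hypothetical chromosome end carrying six telomere repeats.
chromosome_end = "CCATG" + "TTAGGG" * 6

print(longest_tandem_run(allele_a, "GATA"))
print(longest_tandem_run(allele_b, "GATA"))
print(longest_tandem_run(chromosome_end, "TTAGGG"))
```

The two alleles share the same flanking sequence and differ only in copy number, which is the property that gives Variable Number Tandem Repeats their name.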