Only 1−2% of the human genome codes for proteins. The rest consists of non-coding sequences such as non-coding RNA genes, untranslated regions, splice sites, and transposable elements. The functions of most of these elements are unknown.
Non-coding RNAs (ncRNAs) are transcripts that do not code for proteins. MicroRNAs (miRNAs) are small ncRNAs of 18−25 nucleotides. They bind to complementary regions on mRNA, where they block translation and reduce the stability of the transcript.
Deletion of miRNA genes has been associated with cancer progression. Apart from deletions, point mutations in a miRNA can affect its processing and its recognition of target mRNA sequences.
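To make the idea of complementary binding concrete, the following Python sketch searches an mRNA for sites that pair perfectly with a miRNA "seed". It assumes the seed is nucleotides 2 to 8 of the miRNA, a common simplification that is not stated in this article. The sequences are invented for illustration, and the helper names `reverse_complement` and `seed_sites` are hypothetical.

```python
from typing import List

# Watson-Crick pairing for RNA: A-U and C-G.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, mrna: str, seed_start: int = 1, seed_len: int = 7) -> List[int]:
    """Return 0-based mRNA positions that pair perfectly with the miRNA seed.

    Assumption: the seed is nucleotides 2-8 of the miRNA (indices 1..7).
    """
    seed = mirna[seed_start:seed_start + seed_len]
    target = reverse_complement(seed)  # what the mRNA must read, 5' to 3'
    positions = []
    i = mrna.find(target)
    while i != -1:
        positions.append(i)
        i = mrna.find(target, i + 1)
    return positions


# Invented toy sequences, not real transcripts.
mirna = "AUGGCAUCGAUUCGGAUACGUA"
mrna = "CCAAGAUGCCAUUGG"

# A single point mutation inside the seed (G to A at index 3).
mutant_mirna = mirna[:3] + "A" + mirna[4:]

print(seed_sites(mirna, mrna))         # [4]
print(seed_sites(mutant_mirna, mrna))  # []
```

In this toy case the single seed mutation removes the only matching site, which illustrates how a point mutation could alter target recognition. Real target prediction also weighs factors such as site context and conservation, which this sketch ignores.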
Long non-coding RNA
More than 50,000 RNAs that are not translated into proteins are transcribed from the human genome. Most of them are longer than 200 nucleotides and have therefore been termed long non-coding RNAs (lncRNAs).
Although they do not code for proteins, studies have uncovered critical roles for them in regulating several processes inside the cell. LncRNAs can be present in the nucleus or the cytoplasm, where they have been shown to regulate the cell cycle, cell differentiation, proliferation, and the transcription of genes. LncRNAs can act by recruiting epigenetic effectors, which modify the expression of protein-coding genes without altering the DNA sequence.
A large portion of the non-coding genome consists of “jumping genes”, or “transposable elements”. These sequences can “jump” from one region of the genome to another. Several functions have been attributed to them. They can carry regulatory sequences, which in turn regulate the expression of protein-coding genes.
Because these elements can move and insert themselves into different regions, they can enhance, reduce, or completely stop the expression of coding sequences, depending on where they are inserted. For example, some of them have been implicated in the neurodegenerative disease amyotrophic lateral sclerosis (ALS).
Although regulatory regions do not code for proteins, they contain promoters and enhancers that influence the expression of coding genes. Structural alterations in these regions, such as translocations, deletions, insertions, or duplications, can change the interaction between the regulatory elements and the coding genes. Many regulatory regions also lie in the vicinity of oncogenes and control their activation or repression.
5’-Untranslated regions (5’-UTR)
5’-UTRs, as the name suggests, are sequences that are not translated. They lie immediately upstream of the coding region in mRNA. Although the functions of many 5’-UTRs are not known, a number of them have been found to regulate translation or mRNA stability through different mechanisms.
They can also influence translation by reducing the access of the translational machinery to the coding region. Mutations in a 5’-UTR can also create new initiation codons. For example, the generation of premature start codons by mutations in the 5’-UTR has been linked to melanoma. However, the functional characterization of 5’-UTRs and their mutations is still incomplete.
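As a rough illustration of how a single mutation can create a premature start codon, the sketch below scans a 5’-UTR for AUG triplets and reports whether each one is in frame with the main start codon, which is assumed to begin immediately after the UTR. The sequence is invented and the function name `upstream_start_codons` is hypothetical. It is a minimal sketch, not a model of how ribosomes actually select start sites.

```python
from typing import List, Tuple


def upstream_start_codons(utr5: str) -> List[Tuple[int, bool]]:
    """Return (position, in_frame) for every AUG found in a 5'-UTR.

    Assumption: the main start codon begins right after the last UTR base,
    so an upstream AUG at index i is in frame when (len(utr5) - i) % 3 == 0.
    """
    utr5 = utr5.upper().replace("T", "U")  # accept DNA or RNA letters
    hits = []
    for i in range(len(utr5) - 2):
        if utr5[i:i + 3] == "AUG":
            in_frame = (len(utr5) - i) % 3 == 0
            hits.append((i, in_frame))
    return hits


# Invented 12-nucleotide 5'-UTR with no AUG.
wild_type = "GGCACGCUUCGA"

# Point mutation C to U at index 4 turns "ACG" into "AUG".
mutant = wild_type[:4] + "U" + wild_type[5:]

print(upstream_start_codons(wild_type))  # []
print(upstream_start_codons(mutant))     # [(3, True)]
```

Here the mutant UTR gains an AUG that is in frame with the assumed main start codon, so translation starting there would add extra residues to the protein. An out-of-frame upstream AUG would instead be expected to produce an unrelated peptide.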
Introns and splice sites
Introns are also non-coding regions, and mutations and alterations in introns and intronic splice sites often receive little attention. However, changes in the splice sites of an intron can lead to the loss of neighboring exons or the retention of the intron in the mature transcript.
Many cancers are associated with mutations in intronic splice sites that lead to the deletion of essential exons. Introns may also contain regulatory elements, and mutations that destroy those sequences can change gene expression.
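The effect of a splice-site mutation can be sketched with a toy model. It relies on the well-known rule that most introns begin with GT and end with AG in the DNA sequence, a detail not stated in this article. In the sketch, an intron is removed only if both of these boundary dinucleotides are intact, and is otherwise retained. The sequences and the function names `has_canonical_splice_sites` and `splice` are hypothetical. Real splicing depends on many more signals, and a damaged site can also cause exon skipping rather than intron retention.

```python
from typing import List, Tuple


def has_canonical_splice_sites(intron: str) -> bool:
    """Check the GT...AG rule at the two ends of an intron (DNA letters)."""
    intron = intron.upper().replace("U", "T")
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(segments: List[Tuple[str, str]]) -> str:
    """Join exons and drop introns whose GT...AG boundaries are intact.

    Toy assumption: an intron with a damaged boundary is simply retained.
    """
    kept = []
    for kind, seq in segments:
        if kind == "exon" or not has_canonical_splice_sites(seq):
            kept.append(seq)
    return "".join(kept)


# Invented gene: exon, intron, exon.
normal_gene = [("exon", "ATGGCC"), ("intron", "GTAAGTCCTTAG"), ("exon", "GATTGA")]

# Same gene with a G to A mutation at the first base of the intron.
mutant_gene = [("exon", "ATGGCC"), ("intron", "ATAAGTCCTTAG"), ("exon", "GATTGA")]

print(splice(normal_gene))  # ATGGCCGATTGA
print(splice(mutant_gene))  # ATGGCCATAAGTCCTTAGGATTGA
```

In the mutant, the retained intron changes the sequence that follows the first exon, which is one way a mutation outside the coding region can still alter the resulting protein.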
Non-coding regions make up almost 98% of our genome. Although they do not code for proteins, they contain important regulatory elements that control the expression of the roughly 2% that does.