Comparative genomics is a field of biology in which the genomes of different species are compared with each other to understand the evolutionary and molecular differences between them. The development of low-cost, next-generation sequencing has made it possible to analyze large numbers of related genomes using comparative genomics. This article describes the techniques used in comparative genomics and what each of them can reveal.
Genome sequencing and genome comparison
Genetic information is encoded by four nucleotide bases: adenine, cytosine, guanine, and thymine. Determining the order of these bases along a DNA molecule forms the basis of sequencing. Along with the human genome, the genomes of many other organisms have now been sequenced, including chimpanzees, mice, fruit flies, puffer fish, roundworms, baker's yeast, and bacteria. In total, the genomes of more than 1000 prokaryotic organisms and 1300 species have been sequenced to date.
The first step in comparative genomics is to compare general features such as genome size, number of genes, and chromosome number. For example, Arabidopsis (a plant) has a smaller genome than Drosophila (the fruit fly), yet it has nearly twice as many genes. Interestingly, the number of genes in Arabidopsis is similar to that in humans, suggesting that neither genome size nor gene count is a reliable indicator of complexity or evolutionary status.
DNA sequencing and synteny
Synteny refers to the conservation of blocks of genes in the same order across species. Comparing these blocks identifies similar and dissimilar regions of the genomes, and the extent of similarity may vary across chromosomes. For example, chromosome 20 of humans corresponds almost completely to part of chromosome 2 of the mouse.
Similarly, chromosome 17 of humans corresponds with chromosome 11 of the mouse. Such analysis can show which chromosomal changes have occurred in the mouse and human lineages since they diverged from a common ancestor roughly 75–80 million years ago.
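The idea of a syntenic block can be illustrated with a small sketch. The gene names, the ortholog table, and the function `syntenic_blocks` below are hypothetical and chosen only for illustration. The sketch assumes one-to-one orthologs and a single chromosome per species, which is far simpler than what real synteny tools handle.

```python
from typing import Dict, List, Tuple


def syntenic_blocks(
    order_a: List[str],
    order_b: List[str],
    orthologs: Dict[str, str],
    min_len: int = 2,
) -> List[List[Tuple[str, str]]]:
    """Group genes of species A into blocks whose orthologs are adjacent in species B.

    Assumes one-to-one orthologs. Genes of A without an ortholog in B are skipped,
    so a block may span such genes.
    """
    position_b = {gene: index for index, gene in enumerate(order_b)}
    # Keep only the genes of A that have an ortholog present in B, in A's order.
    pairs = [(g, orthologs[g]) for g in order_a if orthologs.get(g) in position_b]

    blocks: List[List[Tuple[str, str]]] = []
    current: List[Tuple[str, str]] = []
    for pair in pairs:
        # Extend the block if this ortholog sits next to the previous one in B
        # (in either direction, so inverted blocks are also detected).
        if current and abs(position_b[pair[1]] - position_b[current[-1][1]]) == 1:
            current.append(pair)
        else:
            if len(current) >= min_len:
                blocks.append(current)
            current = [pair]
    if len(current) >= min_len:
        blocks.append(current)
    return blocks


# Hypothetical gene orders: a1..a3 are inverted in B, a4..a6 keep their order.
order_a = ["a1", "a2", "a3", "a4", "a5", "a6"]
order_b = ["b4", "b5", "b6", "b3", "b2", "b1"]
orthologs = {f"a{i}": f"b{i}" for i in range(1, 7)}

blocks = syntenic_blocks(order_a, order_b, orthologs)
for block in blocks:
    print([gene_a for gene_a, _ in block], "<->", [gene_b for _, gene_b in block])
```

In this toy input the six genes fall into two blocks, one of which is inverted in the second species. Breakpoints between such blocks are the kind of chromosomal rearrangement that synteny analysis is used to uncover.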
Homologous DNA analysis
Another method used in comparative genomics is homology analysis, in which homologous sequences from different species are aligned. For example, in one study, the human gene for the enzyme pyruvate kinase was aligned with the homologous sequences from dog, mouse, chicken, and zebrafish (among others), and the regions of high sequence similarity were then plotted.
This analysis showed high similarity between the human and macaque (a primate) sequences, whereas chicken and zebrafish showed similarity only in the coding regions. Such analysis can be used to find which genomic features have been preserved during the course of evolution and, conversely, which features have diversified.
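Plotting regions of high similarity typically starts from a per-window identity score along an alignment. The following is a minimal sketch of that step, assuming the two sequences have already been aligned to equal length with `-` marking gaps. The function name `windowed_identity` and the toy sequences are made up for illustration and are not taken from the study mentioned above.

```python
from typing import List, Tuple


def windowed_identity(
    aligned_a: str, aligned_b: str, window: int = 10, step: int = 5
) -> List[Tuple[int, float]]:
    """Return (window start, percent identity) pairs along a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    profile: List[Tuple[int, float]] = []
    for start in range(0, len(aligned_a) - window + 1, step):
        chunk_a = aligned_a[start:start + window]
        chunk_b = aligned_b[start:start + window]
        # A column counts as a match only if both residues agree and are not gaps.
        matches = sum(1 for x, y in zip(chunk_a, chunk_b) if x == y and x != "-")
        profile.append((start, 100.0 * matches / window))
    return profile


# Toy alignment: the first half is identical, the second half has diverged.
seq_a = "ATGGCCAAGTTTGACCTGAA"
seq_b = "ATGGCCAAGTAAGTCCTCAT"

profile = windowed_identity(seq_a, seq_b, window=10, step=10)
for start, identity in profile:
    print(f"window at {start}: {identity:.0f}% identity")
```

Windows that stay above a chosen identity threshold across distantly related species are candidates for conserved features, while windows that fall below it mark regions that have diversified.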
Phylogenetic distance is a measure of the degree of separation between two organisms. It is based on the number of sequence changes that have accumulated over a period of years or generations. This distance is inversely related to the sequence similarity between the organisms: the lower the sequence similarity, the greater the phylogenetic distance between them.
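One simple way to turn sequence changes into a distance is to count the proportion of differing sites in an alignment (the p-distance) and then correct it for multiple substitutions at the same site. The sketch below uses the Jukes-Cantor model for that correction. This is one standard choice among several and is an assumption here, since the article does not name a specific model. The function names and toy sequences are illustrative.

```python
import math


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    if p >= 0.75:
        raise ValueError("sequences are too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment with 2 differences over 10 sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"

p = p_distance(seq_a, seq_b)
d = jukes_cantor_distance(p)
print(f"p-distance: {p:.3f}, Jukes-Cantor distance: {d:.3f}")
```

The corrected distance is always at least as large as the raw proportion of differences, and the gap between the two widens as similarity falls, which reflects the inverse relationship between similarity and distance described above.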
Over long phylogenetic distances (such as one billion years since the organisms separated), only general inferences can be drawn. However, for shorter phylogenetic distances, such as 50–200 million years since separation, functional and non-functional DNA can be distinguished, which can subsequently lead to the identification of coding regions, non-coding RNAs, regulatory regions, etc.
For phylogenetic distances of less than 5 million years, sequence differences can be used to explain small, subtle differences in shape and form. Comparative genomics can therefore provide useful information at every one of these evolutionary timescales.
Advantages of comparative genomics
Comparative genomics has led to interesting insights, for example that humans and fruit flies share about 60% of their genes. Also, almost two-thirds of human cancer genes have homologous genes in the fruit fly. Such results have a profound impact on human health research. Beyond health, comparative genomics also has implications for fields such as agriculture, biotechnology, zoology, and conservation biology.