Scientists at Duke University have created the first map of imprinted genes throughout the human genome, and they say a modern-day Rosetta Stone, a form of artificial intelligence called machine learning, was the key to their success.
The study revealed four times as many imprinted genes as had been previously identified and is featured on the cover of the December 3 issue of Genome Research.
In classic genetics, children inherit two copies of a gene, one from each parent, and both actively shape how the child develops. But in imprinting, one of those copies is turned off by molecular instructions coming from either the mother or the father. This process of “imprinting” information on a gene is believed to happen during the formation of an egg or sperm, and it means that a child will inherit only one working copy of that gene. That's why imprinted genes are so vulnerable to environmental pressures: If the only functioning copy is damaged or lost, there's no backup to jump in and help out.
Many of the newly identified imprinted genes lie within genomic regions linked to the development of major diseases like cancer, diabetes, autism, and obesity. Researchers say that if some of these genes are later shown to be active in these disorders, they may offer clues to better disease prevention or management.
“Imprinted genes have always been something of a mystery, partly because they don't follow the conventional rules of inheritance,” says Dr. Randy Jirtle, a genetics researcher in the departments of radiation oncology and pathology at Duke and a senior author of the study. “We're hoping this new roadmap will help us and others find more information about how these genes affect our health and well-being.”
The technical wizardry needed to find the genes fell to Dr. Alexander Hartemink, the other senior author of the study and an assistant professor in Duke's department of computer science, and Philippe Luedi, the first author of the study. They fed sequence data from two types of genes – ones known to be imprinted and ones believed not to be imprinted – into a computer and asked it to discover the differences. This machine learning approach produced an algorithm that, like the original Rosetta Stone, could decode seemingly impenetrable data: in this case, specific DNA sequences that pointed to the presence of imprinted genes.
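For readers curious what "asking a computer to discover the differences" can look like in practice, here is a minimal Python sketch of the general idea. It is not the Duke team's algorithm. The sequences are made up and carry no biological meaning, and the choice of features (counts of two-letter DNA "words") and of classifier (assigning a sequence to whichever group's average it sits closest to) are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import product

K = 2  # length of the DNA "words" (k-mers); an illustrative choice
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def kmer_frequencies(seq, k=K):
    """Turn a DNA string into a fixed-length list of k-mer frequencies."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in KMERS) or 1  # avoid division by zero
    return [counts[m] / total for m in KMERS]

def centroid(vectors):
    """Average a list of equal-length feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train(imprinted_seqs, non_imprinted_seqs):
    """Learn one average feature profile per class from labeled examples."""
    return {
        "imprinted": centroid([kmer_frequencies(s) for s in imprinted_seqs]),
        "not imprinted": centroid([kmer_frequencies(s) for s in non_imprinted_seqs]),
    }

def predict(model, seq):
    """Label a new sequence with the class whose profile it is closest to."""
    features = kmer_frequencies(seq)

    def squared_distance(profile):
        return sum((a - b) ** 2 for a, b in zip(features, profile))

    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical toy data: invented strings, not real gene sequences.
known_imprinted = ["CGCGCGCGGCGC", "GCGCGCCGCGCG", "CGCGGCGCGCGC"]
believed_not_imprinted = ["ATATATTATATA", "TATATAATATAT", "ATTATATATAAT"]

model = train(known_imprinted, believed_not_imprinted)
print(predict(model, "GCGCGCGCGCGC"))  # imprinted
print(predict(model, "TATATATATATA"))  # not imprinted
```

The shape of the workflow is the same as the one described above: labeled examples go in, the program summarizes what distinguishes the two groups, and that summary is then applied to sequences whose status is unknown. A real study would work with far more data and far richer sequence features than this toy.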
“We can't say for certain that we identified all of them, but we think we found a large number,” says Hartemink.