Mutations within junk DNA linked to autism spectrum disorder

Using artificial intelligence, scientists have discovered mutations in non-coding DNA, often called “junk DNA,” that can contribute to autism. The researchers describe the work as the first clear demonstration of non-inherited, non-coding mutations contributing to a complex human disorder. The study, titled “Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk,” was published this week in the journal Nature Genetics.

The research was led by Olga Troyanskaya, deputy director for genomics at the Flatiron Institute's Center for Computational Biology (CCB) in New York City and a professor of computer science at Princeton University. Troyanskaya worked alongside Robert Darnell, a Professor of Cancer Biology at Rockefeller University and an investigator at the Howard Hughes Medical Institute.

The scientists used artificial intelligence software to scan the genome sequences of 1,790 people with autism, along with those of their siblings and parents who did not have autism. Inherited mutations were excluded from the results.

As a result, the analysis considered only spontaneous mutations in each participant's DNA, rather than mutations they could have inherited from their parents.
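The filtering step described above can be sketched in a few lines. This is a minimal illustration only, not the study's pipeline: the `Variant` record, the `de_novo_variants` function, and the toy data are all hypothetical, and a real analysis would read variants from sequencing files and apply quality filters before any comparison.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the study's data format)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def de_novo_variants(
    child: Set[Variant], mother: Set[Variant], father: Set[Variant]
) -> Set[Variant]:
    """Return variants seen in the child but in neither parent."""
    return child - (mother | father)


# Toy data, invented for illustration only.
mother = {Variant("chr1", 1000, "A", "G")}
father = {Variant("chr2", 5000, "C", "T")}
child = {
    Variant("chr1", 1000, "A", "G"),  # inherited from the mother, so excluded
    Variant("chr7", 3200, "G", "A"),  # absent from both parents, so kept
}

spontaneous = de_novo_variants(child, mother, father)
```

Here `spontaneous` holds only the variant that neither parent carries, which is the kind of mutation the study kept.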

The AI aspect of the study allowed the team to accurately connect the DNA mutations detected to the development of autism in the individuals.

“This is the first clear demonstration of non-inherited, non-coding mutations causing any complex human disease or disorder.”

Olga Troyanskaya, Lead Author

Study co-author Jian Zhou said that many other diseases, such as cancers and heart disease, could be evaluated using these techniques: “This enables a new perspective on the cause of not just autism, but many human diseases.”

Only around 1 to 2 percent of the genome is made up of genes that encode proteins. These proteins regulate the various functions of cells throughout the body. Much of the remaining non-coding DNA serves to regulate gene expression. The scientists noted that some participants had mutations in regions that do not code for any protein, while others had mutations in coding regions. Both types were similarly associated with autism.

Why artificial intelligence?

Protein-coding genes make up only around 1 to 2 percent of the genome. Much of the rest regulates when and where those genes are expressed.

Earlier work that focused on mutations in the coding regions of genes could explain at most around 30 percent of autism cases in people with no family history of the condition. The cause of the remaining cases was unclear, which prompted the team to look for a connection between autism and the non-coding regions of the genome.

The team quickly realized that finding disease-relevant mutations in non-coding DNA is like looking for a needle in a haystack. There are often dozens of mutations in the non-coding regions of a person's DNA, and many of these will not cause that person to develop a particular disease. Traditional genomics tools could not pick out the mutations that matter, so the scientists needed a different approach.

Troyanskaya and her colleagues decided to use artificial intelligence to predict what effect each mutation in the non-coding regions of the genome would have.

“This is a shift in thinking about genetic studies that we're introducing with this analysis. In addition to scientists studying shared genetic mutations across large groups of individuals, here we're applying a set of smart, sophisticated tools that tell us what any specific mutation is going to do, even those that are rare or never observed before.”

Chandra Theesfeld, Co-author
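The idea of asking a model what a specific mutation is going to do can be illustrated by scoring a DNA sequence before and after a single-base change and taking the difference. The sketch below is an assumption-laden toy: `toy_regulatory_score` simply counts occurrences of a short motif and stands in for the trained deep-learning model used in the study, and every name and sequence here is hypothetical.

```python
def apply_snv(sequence: str, position: int, alt_base: str) -> str:
    """Return a copy of the sequence with one base replaced (0-based position)."""
    return sequence[:position] + alt_base + sequence[position + 1:]


def toy_regulatory_score(sequence: str, motif: str = "TATA") -> int:
    """Stand-in for a trained model: count (possibly overlapping) motif matches.

    This is NOT the study's model; it only gives the example something to score.
    """
    width = len(motif)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == motif
    )


def predicted_effect(sequence: str, position: int, alt_base: str,
                     score=toy_regulatory_score) -> int:
    """Score of the mutated sequence minus score of the reference sequence."""
    return score(apply_snv(sequence, position, alt_base)) - score(sequence)


# Toy reference sequence, invented for illustration only.
reference = "GGTATAGG"
disruptive = predicted_effect(reference, 3, "C")  # change falls inside the motif
neutral = predicted_effect(reference, 0, "A")     # change falls outside the motif
```

Because the effect is computed per mutation rather than from how often a mutation appears in a population, the same comparison works for a change that has never been observed before.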

The researchers also noted that specific mutations in the non-coding regions could be linked to different IQs of children on the autism spectrum.

The machine learning model used data from the Simons Simplex Collection, maintained by the Simons Foundation. This collection contains the whole genome sequences of nearly 2,000 “quartets,” each made up of a child with autism, a sibling without autism, and their two parents, who also do not have autism.

Comparing the four family members makes it possible to identify mutations that arose spontaneously in the child with autism and were not inherited. The team also calculated the predicted effects of the spontaneous mutations found in the sibling unaffected by autism, for comparison. Zhou explains, “The design of the Simons Simplex Collection is what allowed us to do this study. The unaffected siblings are a built-in control.”
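One way to picture the unaffected siblings acting as a built-in control is to compare predicted-effect scores for mutations in affected children against scores for mutations in their siblings. The sketch below uses a simple permutation test, which is swapped in here for illustration and is not claimed to be the study's statistical method. The function names and all scores are hypothetical.

```python
import random
from statistics import mean


def mean_difference(proband_scores, sibling_scores):
    """Mean predicted effect in affected children minus mean in siblings."""
    return mean(proband_scores) - mean(sibling_scores)


def permutation_p_value(proband_scores, sibling_scores,
                        n_permutations=10_000, seed=0):
    """One-sided permutation test: how often does shuffling the group labels
    give a mean difference at least as large as the observed one?"""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean_difference(proband_scores, sibling_scores)
    pooled = list(proband_scores) + list(sibling_scores)
    n_probands = len(proband_scores)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled = mean(pooled[:n_probands]) - mean(pooled[n_probands:])
        if shuffled >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Toy scores, invented for illustration only.
proband_scores = [0.62, 0.71, 0.55, 0.80, 0.67]
sibling_scores = [0.41, 0.52, 0.38, 0.60, 0.45]

difference = mean_difference(proband_scores, sibling_scores)
p_value = permutation_p_value(proband_scores, sibling_scores)
```

A small `p_value` would suggest that the higher scores in affected children are unlikely to be due to chance. With real data, the pairing of siblings within families and multiple-testing corrections would also need to be handled.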

Co-author Christopher Park, a research scientist at CCB, said in a statement, “This is consistent with how autism most likely manifests in the brain. It's not just the number of mutations occurring, but what kind of mutations are occurring.”

When the team inserted certain of these mutations into cells in the lab, the mutations altered the way genes were expressed, in line with the outcomes the machine learning model had predicted.

Troyanskaya added that many diseases could be explored in the 98 percent of the genome that is non-coding, saying, “Right now, 98 percent of the genome is usually being thrown away. Our work allows you to think about what we can do with the 98 percent.”

The authors concluded that their “predictive genomics framework illuminates the role of noncoding mutations in ASD and prioritizes mutations with high impact for further study, and is broadly applicable to complex human diseases.”


Simons Foundation Press Release. 27th May 2019.

Zhou, J., et al. (2019). Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. Nature Genetics.

Written by

Dr. Ananya Mandal

Dr. Ananya Mandal is a doctor by profession, lecturer by vocation and a medical writer by passion. She specialized in Clinical Pharmacology after her bachelor's (MBBS). For her, health communication is not just writing complicated reviews for professionals but making medical knowledge understandable and available to the general public as well.


Cite this article (APA):

Mandal, Ananya. (2023, March 11). Mutations within junk DNA linked to autism spectrum disorder. News-Medical.


  1. Rhonda Rhonda New Zealand says:

    The sentence "The machine learning model used by the team was called the Simons Simplex Collection from the Simons Foundation." is wrong, the collection is the dataset, not the model used by the team. The team used a Convolutional Neural Network along with a linear regularized model.
