New ‘SMART’ system holds potential to accelerate molecular structure identification process

November 8, 2017

An interdisciplinary team of researchers at the University of California San Diego has developed a method to identify the molecular structures of natural products that is significantly faster and more accurate than existing methods. The method works like facial recognition for molecular structures--it uses a piece of spectral data unique to each molecule and then runs it through a deep learning neural network to place the unknown molecule in a cluster of molecules with similar structures.

The patent-pending system is called "SMART," which stands for Small Molecule Accurate Recognition Technology, and has the potential to accelerate the molecular structure identification process tenfold. This development could represent a paradigm shift in the chemical analysis, pharmaceutical and drug discovery fields, since 70 percent of all FDA-approved drugs are based on natural products from sources such as soil microorganisms, terrestrial plants and, increasingly, marine life forms such as algae.

This work, published in the journal Scientific Reports, represents a collaboration between the UC San Diego Jacobs School of Engineering and the UC San Diego Scripps Institution of Oceanography.

"The structure of a molecule is the enabling information," said Bill Gerwick, professor of oceanography and pharmaceutical sciences at UC San Diego's Scripps Institution of Oceanography. "You have to have the structure for any FDA approval. If you want to have intellectual property you have to patent that structure, if you want to make analogs of that molecule you need to know what the starting molecule is--it's a critical piece of information."

Chen Zhang is a nanoengineering Ph.D. student at the UC San Diego Jacobs School of Engineering and the first author of the new Scientific Reports paper. Zhang said that determining a molecule's structure can be a bottleneck in natural product research, taking experts months or even years to arrive at the correct and complete structure. While each molecule and its identification timeline are different, the SMART approach gives researchers an early clue as to which family a new molecule belongs to, drastically reducing the time it takes to characterize a new natural product.

"The way we were able to accelerate the process is by essentially using facial recognition software to look at the key piece of information we obtain on the molecules," Gerwick explained. The key piece of information the team uses is something called a heteronuclear singular quantum coherence nuclear magnetic resonance, or HSQC NMR, spectrum. It produces a topological map of spots that reveal which protons in the molecule are attached directly to which carbon atoms, and is unique to every molecule.

Zhang and Gerwick teamed up with Gary Cottrell, a computer science and engineering professor at the UC San Diego Jacobs School of Engineering, to develop a deep learning system trained with thousands of HSQC spectra pulled from the literature. This convolutional neural network takes a 2D image of the HSQC NMR spectrum of an unknown molecule and maps it to a point in a 10-dimensional space, where it lands near molecules with similar structures. Seeing which known molecules are its neighbors makes it easier for researchers to elucidate the unknown molecule's structure.
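Once a spectrum has been mapped to a point in that space, finding structurally similar molecules amounts to a nearest-neighbor search. The sketch below assumes Euclidean distance and uses made-up compound names and coordinates; the article does not specify the distance measure or how the lookup is implemented.

```python
import math


def nearest_neighbors(query, library, k=3):
    """Return the names of the k library entries closest to the query.

    `library` maps a compound name to its embedding (a list of floats).
    Euclidean distance is an assumption made for this illustration.
    """
    ranked = sorted(library.items(), key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical 10-dimensional embeddings of known compounds.
library = {
    "compound_A": [0.0] * 10,
    "compound_B": [1.0] * 10,
    "compound_C": [0.1] * 10,
}
unknown = [0.08] * 10  # embedding of an unidentified molecule

print(nearest_neighbors(unknown, library, k=2))
```

In this toy example the unknown sits closest to compound_C and then compound_A, which would suggest starting the structure elucidation from their family.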

"Chen took this approach to getting NMR spectra of over 4,000 compounds from the literature by literally cutting out the images from the PDFs of the papers," Cottrell said. "It was an awesome effort! Even so, this is normally not enough data to train a deep network, but we used a technology called a Siamese network, in which you train on pairs of images. This amplifies your training set by roughly the square of the number of compounds in a family, and is what made this project feasible."

This collaboration is the first time Gerwick has mentored an engineering student, and the exchange of ideas proved fruitful.

"It's been a wonderful interaction. UC San Diego has something really quite magical about it, and that is the depth of collaboration that occurs between departments--it's phenomenal," Gerwick said. "When you try and thoughtfully take from another discipline something that is maybe even commonplace in that discipline and apply it in a new and unique way in our discipline, it's an opportunity to really have this kind of paradigm-shifting thing. And I think this technology, with some advancement, could be a real paradigm shift in the way we do all kinds of chemistry and chemical analysis."

The team will get that chance for advancement thanks to a $550,000 grant from the National Institutes of Health to develop efficient methods that facilitate the automated structural classification, feature discovery and structure elucidation of natural products and to build an infrastructure that interacts with data input from the community.

Source:

http://jacobsschool.ucsd.edu/news/news_releases/release.sfe?id=2353
