Integrating physicochemical laws into future AI models for better drug design

University of Basel, Oct 30 2025

Proteins play a key role not only in the body, but also in medicine: they either serve as active ingredients, such as enzymes or antibodies, or they are target structures for drugs. The first step in developing new therapies is therefore usually to decipher the three-dimensional structure of proteins.

For a long time, elucidating protein structures was a highly complex endeavor, until machine learning found its way into protein research. AI models with names such as AlphaFold or RoseTTAFold have ushered in a new era: they predict how the chain of protein building blocks, known as amino acids, folds into a three-dimensional structure. In 2024, the developers of these programs received the Nobel Prize in Chemistry.

Suspiciously high success rate

The latest versions of these programs go one step further: they calculate how the protein in question interacts with another molecule – a docking partner or "ligand", as experts call it. This could be an active pharmaceutical ingredient, for example.

"This possibility of predicting the structure of proteins together with a ligand is invaluable for drug development."

Professor Markus Lill, University of Basel

Together with his team at the Department of Pharmaceutical Sciences, he researches methods for designing active pharmaceutical ingredients.

However, the apparently high success rates of the structural predictions puzzled Lill and his staff, especially as only around 100,000 elucidated protein structures together with their ligands are available for training the AI models, relatively few compared with other AI training data sets. "We wanted to find out whether these AI models really learn the basics of physical chemistry using the training data and apply them correctly," says Lill.

Same prediction for significantly altered binding sites

The researchers modified the amino acid sequence of hundreds of sample proteins in such a way that the binding sites for their ligands exhibited a completely different charge distribution or were even blocked entirely. Nevertheless, the AI models predicted the same complex structure – as if binding were still possible. The researchers pursued a similar approach with the ligands: they modified them in such a way that they would no longer be able to dock to the protein in question. This did not bother the AI models either.
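The kind of sequence perturbation described above can be illustrated with a minimal Python sketch. It is not the researchers' actual protocol: the function names, the charge-swap table, the choice of phenylalanine as a bulky "blocking" residue, and the example sequence are all assumptions made for illustration.

```python
# Illustrative sketch only; names and residue choices are assumptions,
# not taken from the study.

# Swap acidic residues (D, E) for a basic one and basic residues (K, R)
# for an acidic one, reversing the local charge.
OPPOSITE_CHARGE = {"D": "K", "E": "K", "K": "E", "R": "E"}


def flip_binding_site_charges(sequence, site_positions):
    """Return a copy of `sequence` in which charged residues at the given
    0-based positions are replaced by an oppositely charged residue.
    Uncharged residues are left unchanged."""
    residues = list(sequence)
    for pos in site_positions:
        residues[pos] = OPPOSITE_CHARGE.get(residues[pos], residues[pos])
    return "".join(residues)


def block_binding_site(sequence, site_positions, bulky="F"):
    """Return a copy of `sequence` in which every binding-site position is
    replaced by a bulky residue (phenylalanine by default, an assumption)."""
    residues = list(sequence)
    for pos in site_positions:
        residues[pos] = bulky
    return "".join(residues)


if __name__ == "__main__":
    toy_sequence = "MKDEAR"       # hypothetical toy sequence
    toy_site = [1, 2, 3, 5]       # hypothetical binding-site positions
    print(flip_binding_site_charges(toy_sequence, toy_site))
    print(block_binding_site(toy_sequence, toy_site))
```

The altered sequence would then be passed to the prediction model in place of the original, and the two predicted complexes compared.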

In more than half of the cases, the models predicted the structure as if the changes to the amino acid sequence had never been made. "This shows us that even the most advanced AI models do not really understand why a drug binds to a protein; they only recognize patterns that they have seen before," says Lill.
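One simple way to quantify "the same structure" is the root-mean-square deviation (RMSD) between the ligand positions predicted for the original and the altered protein. The sketch below is an illustration under stated assumptions, not the study's evaluation code: it assumes both predictions are already superposed in the same protein frame, that ligand atoms are listed in the same order, and that a cutoff of 2 Å (a common convention in docking studies) counts as "unchanged".

```python
import math


def ligand_rmsd(coords_a, coords_b):
    """RMSD between two ligand poses given as equal-length lists of
    (x, y, z) tuples in the same atom order and the same reference frame."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def fraction_unchanged(rmsds, threshold=2.0):
    """Fraction of cases whose ligand RMSD is at or below `threshold`
    (in angstroms; the 2.0 default is an assumed convention)."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    return sum(r <= threshold for r in rmsds) / len(rmsds)


if __name__ == "__main__":
    # Hypothetical two-atom ligand: one atom moves by 2 angstroms along z.
    original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    perturbed = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(ligand_rmsd(original, perturbed))
    # Hypothetical RMSD values for four test cases.
    print(fraction_unchanged([0.5, 1.9, 3.0, 6.2]))
```

A high fraction of unchanged poses after a binding site has been disrupted would indicate that the model is reproducing a memorized pattern rather than responding to the altered chemistry.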

Unknown proteins are particularly difficult

The AI models faced particular difficulties if the proteins did not show any similarity to the training data sets. "When they see something completely new, they quickly fall short, but that is precisely where the key to new drugs lies," emphasizes Markus Lill.

AI models should therefore be viewed with caution when it comes to drug development. It is important to validate the predictions of the models using experiments or computer-aided analyses that actually take the physicochemical properties into account. The researchers also used these methods to examine the results of the AI models in the course of their study.

"The better solution would be to integrate the physicochemical laws into future AI models," says Lill. With their more realistic structural predictions, these could then provide a better basis for the development of new drugs, especially for protein structures that have so far been difficult to elucidate, and would open up the possibility of completely new therapeutic approaches.

Source:

University of Basel

Journal reference:

Masters, M. R., et al. (2025). Investigating whether deep learning models for co-folding learn the physics of protein-ligand interactions. Nature Communications. doi.org/10.1038/s41467-025-63947-5
