In a recent study published in Nature Communications, researchers developed an artificial intelligence-based scoring mechanism for early drug discovery campaigns that could be used for compound prioritization, motif rationalization, and biased drug design.
In drug development campaigns, lead optimization is a time-consuming process in which several chemists work together to attain targeted molecular property profiles. With experience, chemists build intuition in areas such as compound prioritization, which enables them to make more efficient judgments. Researchers have explored rule-based techniques and basic cheminformatics desirability rankings, but capturing this complexity has proven difficult. Medicinal chemistry, like any human enterprise, is also subject to subjective biases.
About the study
In the present study, researchers investigated the feasibility of turning medicinal chemists' knowledge into machine-learning models for lead optimization and other drug discovery pipeline choices.
Using comparisons between pairs of molecules, the researchers built a machine-learning model that learned from the preferences of 35 medicinal chemists. The study used a pairwise learning-to-rank design, in which participants were shown two molecules and given a simple prompt to select the one they preferred.
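To make the pairwise learning-to-rank idea concrete, here is a minimal sketch, assuming a linear scoring function over hypothetical numeric descriptors and a logistic (Bradley-Terry style) loss on the score difference. The feature vectors, the toy preference rule, and the convention that a higher score means "preferred" are all assumptions for illustration; the study's actual model and features are not reproduced here.

```python
import math
import random

def score(weights, features):
    """Linear latent score for one molecule (features are hypothetical descriptors)."""
    return sum(w * x for w, x in zip(weights, features))

def train_pairwise(pairs, n_features, epochs=200, lr=0.1):
    """Fit weights so that the preferred molecule of each pair scores higher.

    pairs: list of (features_a, features_b, a_preferred) tuples.
    Uses stochastic gradient descent on the logistic loss of the score difference.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for fa, fb, a_preferred in pairs:
            diff = score(weights, fa) - score(weights, fb)
            p_a = 1.0 / (1.0 + math.exp(-diff))  # modeled P(a preferred over b)
            grad = p_a - (1.0 if a_preferred else 0.0)
            for i in range(n_features):
                weights[i] -= lr * grad * (fa[i] - fb[i])
    return weights

# Toy data (assumption): the simulated annotator prefers a larger first feature.
random.seed(0)
toy_pairs = []
for _ in range(100):
    a = [random.random(), random.random()]
    b = [random.random(), random.random()]
    toy_pairs.append((a, b, a[0] > b[0]))

learned = train_pairwise(toy_pairs, n_features=2)
accuracy = sum(
    (score(learned, a) > score(learned, b)) == pref for a, b, pref in toy_pairs
) / len(toy_pairs)
```

Because only the difference between two scores enters the loss, annotators never have to assign absolute ratings; a single latent score per molecule is recovered from relative choices alone.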
The study ran over several rounds, including two preliminary rounds with 220 molecular pairs and a production run with nearly 5,000 responses. Inter-rater agreement (i.e., the degree to which one chemist's selections agree with those of peers) was assessed using 200 distinct molecular pairs, which gave an intuitive indication of whether an artificial intelligence-based model could learn a signal.
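One common way to quantify agreement between two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an illustration under that assumption; the article does not state which agreement statistic the authors used, and the two label lists are made-up examples ("L" and "R" denote which molecule of a pair was chosen).

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters made the same choice.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater chose independently at their own label rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical choices from two chemists on eight pairs.
chemist_1 = ["L", "R", "L", "L", "R", "R", "L", "R"]
chemist_2 = ["L", "R", "R", "L", "R", "L", "L", "R"]
kappa = cohens_kappa(chemist_1, chemist_2)
```

A kappa near 0 means agreement no better than chance, while values approaching 1 mean near-perfect agreement; moderate values suggest a shared but noisy signal.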
Furthermore, the researchers investigated whether selections were biased by the position of a molecule on the screen (right or left) during annotation. The model was trained on a collection of compounds retrieved from the ChEMBL database, selected on molecular weight (between 200 and 1,000 g/mol) and drug-likeness (QED), with up to two rule-of-five violations permitted.
The compounds were standardized by removing salts, normalizing tautomers, and neutralizing charges before being used in the preference learning problem. For the second preliminary round and the subsequent production rounds, the Novartis Institutes for BioMedical Research (NIBR) substructure filters were applied, resulting in a pool of 1,831,052 molecules. Fragment analysis across various compounds was used to rationalize what the model had learned.
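The left/right position check described above can be framed as a test of whether the left-hand molecule is chosen about half the time. The following is a minimal sketch using an exact two-sided binomial test; the choice of test and the counts in the example are assumptions for illustration, not figures from the study.

```python
from math import exp, lgamma, log

def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Log-space terms avoid overflow for large trial counts.
    """
    def pmf(k):
        log_coeff = lgamma(trials + 1) - lgamma(k + 1) - lgamma(trials - k + 1)
        return exp(log_coeff + k * log(p) + (trials - k) * log(1 - p))

    observed = pmf(successes)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * tolerance)
    return min(1.0, total)

# Hypothetical counts: the left molecule was chosen in 60 of 100 annotations.
p_value = binomial_two_sided_p(60, 100)
```

A small p-value would indicate that annotators favored one side of the screen more often than chance alone would explain.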
After each labeled batch of 1,000 data points, the predictive performance of the model was evaluated using the area under the receiver operating characteristic curve (AUROC) with randomized fivefold cross-validation.
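As a rough illustration of this evaluation, the sketch below computes AUROC directly from its rank interpretation (the probability that a randomly chosen positive example is scored above a randomly chosen negative one) and builds a random five-way partition of the data. The function names and the toy labels are hypothetical; this is not the authors' evaluation code.

```python
import random

def auroc(labels, scores):
    """AUROC as P(random positive scores above random negative); ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def five_fold_indices(n_items, seed=0):
    """Randomly partition item indices into five near-equal held-out folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::5] for i in range(5)]

# Toy example: three of the four positive/negative pairs are ordered correctly.
example_auc = auroc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
folds = five_fold_indices(1000)
```

In a cross-validation loop, each fold in turn would serve as the held-out set while the model is trained on the other four, and the five AUROC values would be averaged.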
A strategy similar to the one published in the original QED study was employed to assess whether the learned scores could be used to deprioritize undesired compounds. The researchers also generated 500 molecules by maximizing and minimizing the learned scoring function, using a pre-trained SMILES-based Long Short-Term Memory (LSTM) generative model and a hill-climbing optimization approach. The overall approach aims to overcome the cognitive bias limitations of prior research and increase the effectiveness of machine-learning models in the pharmaceutical industry.
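Hill climbing itself is a simple greedy search, sketched below on a toy numeric problem. This is a generic illustration only: the study paired hill climbing with a SMILES-based LSTM generator, which is not reproduced here, and `toy_score` is a hypothetical stand-in for a learned scoring function.

```python
import random

def hill_climb(objective, start, steps=500, step_size=0.1, seed=0):
    """Greedy hill climbing: keep a random perturbation only if it raises the objective."""
    rng = random.Random(seed)
    current = list(start)
    current_value = objective(current)
    for _ in range(steps):
        candidate = [x + rng.uniform(-step_size, step_size) for x in current]
        value = objective(candidate)
        if value > current_value:
            current, current_value = candidate, value
    return current, current_value

def toy_score(x):
    """Hypothetical stand-in for a learned score, with its maximum at (1, -2)."""
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)

# Maximizing the score from an arbitrary starting point.
best_max, value_max = hill_climb(toy_score, [0.0, 0.0])
# Minimizing is the same loop applied to the negated objective.
best_min, neg_value_min = hill_climb(lambda x: -toy_score(x), [0.0, 0.0], steps=50)
```

Running the search in both directions, as the study did with its generator, shows what kinds of candidates the scoring function rewards and what kinds it penalizes.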
The data revealed moderate agreement among the chemists' choices in the early rounds. Cross-validation showed that pair classification performance improved steadily as more data became available, with AUROC rising from 0.6 at 1,000 available pairs to 0.74 at 5,000 available pairs.
The study used implicit scoring to build a novel strategy for estimating drug-likeness in drug design. The technique, built from in-house feedback reflecting chemists' years of experience, was reported to be more accurate than the commonly used QED measure.
The algorithm was able to learn medicinal chemists' preferences, picking up on features such as drug-likeness, fingerprint density, and the fraction of allylic oxidation sites. QED was the most strongly correlated descriptor, followed by fingerprint density, allylic oxidation sites, atomic contributions to van der Waals surface area, and Hall-Kier kappa values.
Across the different fingerprint density descriptors, the model ranked feature-richer compounds more favorably, indicating that the chemists preferred molecules with more features.
However, there was a minor positive association with the synthetic accessibility score, indicating that the proposed score preferred synthetically simpler molecules. The SMR_VSA3 descriptor, which measures molecular surface area binned by Wildman-Crippen molar refractivity (MR) values, was modestly negatively correlated, suggesting that chemists favored compounds with neutral nitrogen atoms.
For the FDA-approved drug and GDB collections, the filtering method yielded 732 and 8,616 compounds for analysis, respectively. The distribution of learned scores clearly separated the GDB set from the sets that better represent drug-like space [i.e., DrugBank Food and Drug Administration (FDA)-approved drugs and ChEMBL].
QED scores, by contrast, barely distinguished the three sets. Common medicinal chemistry motifs such as pyrazines, pyrimidines, sulfones, imidazoles, oxadiazoles, phenyls, and bicyclic heteroaromatics were among the best-ranked. Compounds with long flexible chains, conjugated double bonds, unusual groups, reactive components, or many alcohols and carboxylates were scored unfavorably.
Minimizing the scoring function, on the other hand, yielded compounds with a good mixture of aliphatic sp3 carbons and aromatic rings, suitably sized fragments, and functional groups typical of drug-like compounds. The quality of the generated compounds suggested that the scoring function is relevant for de novo drug design.
Overall, the study findings showed that the latent-score machine-learning approach could capture medicinal chemists' knowledge and provide information beyond existing in silico ligand-based properties and fragment definitions. The method could be used in routine cheminformatics tasks, such as deprioritizing molecules not flagged by rule-based techniques, or in biased molecular design.
- Oh-Hyeon Choung, Riccardo Vianello, Marwin Segler, Nikolaus Stiefl, and José Jiménez-Luna. Extracting medicinal chemistry intuition via preference machine learning. Nature Communications (2023) 14:6651. doi: https://doi.org/10.1038/s41467-023-42242-1