Revolutionizing drug discovery: Giga-scale screening unleashes power of AI and virtual libraries

By Pooja Toshniwal Paharia, Apr 28 2023

In a recent review published in the journal Nature, researchers examined recent breakthroughs in ligand discovery tools, their potential to reshape the drug research and development process, and the hurdles faced.

Computer-assisted technologies for developing drugs have been in use for several years, and pharma and academia have recently shifted further toward embracing computational tools. The transition is facilitated by the abundance of data on ligand characteristics and binding to therapeutic targets, the availability of three-dimensional (3D) protein structures, and the emergence of on-demand virtual libraries comprising billions of drug-like small molecules. To fully utilize these resources, fast computational approaches for effective giga-scale screening are required.

In the present review, researchers assessed existing data on computer-assisted approaches in drug discovery and development (DDD).

Study: Computational approaches streamlining drug discovery. Image Credit: angellodeco / Shutterstock

Virtual ligand screening (VLS) technology for identifying high-quality hits

The Protein Data Bank (PDB) comprises >200,000 protein structures. High-resolution cryo-electron microscopy and X-ray structures cover >90% of protein families, and the remaining gaps are filled by homology and/or AlphaFold2 modeling. Between 2015 and 2022, the chemical spaces used to screen and synthesize potential drug candidates grew from 10⁷ off-the-shelf molecules to >3.0 x 10¹⁰ molecules synthesized on demand, with the potential to extend to >10¹⁵ compounds.

Compared with high-throughput screening (HTS; 10⁵ to 10⁷ compounds) and fragment-based ligand discovery (FBLD; 10³ to 10⁵), giga-scale deoxyribonucleic acid (DNA)-encoded library (DEL) screening (10¹⁰) and giga-scale VLS (10¹⁰ to 10¹⁵) start from considerably larger libraries. Hit rates are similar for HTS and giga-scale DEL screening (0.01% to 0.5%), higher for FBLD (1.0% to 5.0%), and highest for VLS (10% to 40%; for VLS, the rate is the proportion of predicted hits that were confirmed experimentally).
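To make the scale of these figures concrete, the minimal sketch below multiplies the library sizes and hit rates quoted above. The function name expected_hits and the figure of 100 tested VLS candidates are illustrative assumptions, not values from the review. Note that the VLS hit rate applies to the compounds selected for experimental testing, not to the whole virtual library.

```python
def expected_hits(n_compounds: float, hit_rate_percent: float) -> float:
    """Number of hits expected when n_compounds are tested at the given hit rate (%)."""
    return n_compounds * hit_rate_percent / 100.0


# HTS range quoted in the article: 10^5 to 10^7 compounds, 0.01% to 0.5% hit rate.
hts_low = expected_hits(1e5, 0.01)
hts_high = expected_hits(1e7, 0.5)

# VLS: the 10% to 40% rate applies to predicted hits sent for testing.
# The count of 100 tested candidates is an assumption for illustration only.
vls_low = expected_hits(100, 10)
vls_high = expected_hits(100, 40)

print(f"HTS: {hts_low:.0f} to {hts_high:.0f} hits")
print(f"VLS: {vls_low:.0f} to {vls_high:.0f} hits from 100 tested compounds")
```

Under these assumptions, a physical HTS campaign must assay up to ten million compounds to obtain its hits, whereas a VLS campaign only synthesizes and tests the short list that the computation selects.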

Initial hits have very weak affinity for FBLD (100 to 1,000 μM, reflecting the small size of the fragments), weak affinity for HTS (1.0 to 10 μM), medium affinity for DEL screening (0.1 to 10 μM), and medium to high affinity for VLS (0.010 to 10 μM). In addition to quantitative structure-activity relationship (QSAR)-based optimization for identifying leads, HTS requires custom synthesis to explore structure-activity relationships, FBLD requires growing or merging of the fragments, and DEL screening requires resynthesis of label-free hits.

VLS optimizes structure-activity relationships using catalog compounds and requires about one-tenth as many custom syntheses (0 to 50) as HTS, FBLD, and DEL screening to identify leads. Further, HTS and FBLD hits are typically not novel: HTS hits require scaffold hopping or modification, and FBLD hits require rational design, to attain intellectual property (IP) novelty. In contrast, most VLS hits are novel.

HTS limitations include modest library sizes, unknown binding modes, and expensive equipment. FBLD limitations include the need for expensive nuclear magnetic resonance (NMR), surface plasmon resonance (SPR), and X-ray equipment, as well as many optimization steps. DEL screening yields many false positives and requires off-DNA resynthesis of hits. VLS requires computational resources, although modular VLS has reduced this requirement by >1,000-fold.

Virtual screening algorithms are based on protein structures, ligands, or both. Protein-based algorithms require high-resolution structures, whereas ligand-based ones require large datasets for ligand activity. Hybrid screening requires data on ligand activity and protein-ligand 3D complexes to generate three-dimensional interaction fingerprints and artificial intelligence (AI)-based models.
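As a rough illustration of the ligand-based idea, the sketch below ranks library compounds by their similarity to a known active. Tanimoto similarity on sets of features is used here as a commonly applied stand-in; the article does not name a specific measure, and the fingerprints and compound names are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical fingerprints: each integer marks a structural feature that is present.
known_active = {1, 4, 7, 9, 12}
library = {
    "cmpd_A": {1, 4, 7, 9, 13},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 7, 9, 12},
}

# Rank library compounds from most to least similar to the known active.
ranked = sorted(
    library,
    key=lambda name: tanimoto(known_active, library[name]),
    reverse=True,
)
print(ranked)
```

A protein-based method would replace the similarity function with a docking score computed against the target structure, and a hybrid method would combine both sources of information.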

Chemical library types and computationally driven technology to streamline the discovery of drugs

Pharma firms screen enormous numbers of compounds in-house, whereas vendor collections allow rapid (<1 week) delivery of in-stock molecules featuring unique chemical scaffolds that can be searched easily and are compatible with high-throughput screening (HTS). However, the cost of managing physical libraries, their slow growth, and their small size limit their applicability.

On-demand chemical spaces such as REAL enable rapid parallel synthesis of molecules from >12,000 building blocks using >180 reactions, with a success rate of >80% and delivery within 2 to 3 weeks. Examples include Galaxy by WuXi, Enamine REAL, and CHEMriya by Otava. Including additional synthons (e.g., using the V-SYNTHES algorithm) and reaction scaffolds enables high novelty and rapid polynomial growth of the virtual chemical spaces used for drug development.
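The polynomial growth mentioned above follows from simple counting: a reaction that joins k synthons yields a number of products equal to the product of the synthon counts at each position. The sketch below illustrates this with hypothetical synthon counts; real reactions accept only compatible subsets of the building blocks, so actual space sizes differ.

```python
from math import prod


def space_size(synthons_per_position):
    """Products of one reaction: the product of the synthon counts at each position."""
    return prod(synthons_per_position)


# Hypothetical counts of compatible synthons per reaction position.
two_component = space_size([1000, 1000])
three_component = space_size([1000, 1000, 1000])

# Doubling the synthons at every position of a three-component reaction.
doubled = space_size([2000, 2000, 2000])

print(two_component, three_component, doubled // three_component)
```

With these numbers, a two-component reaction gives one million products, a three-component reaction gives one billion, and doubling the available synthons multiplies the three-component space eightfold.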

The V-SYNTHES algorithm can be used to effectively screen giga-scale spaces, from the >31 billion (>3.0 x 10¹⁰) compounds of REAL space to the >10¹⁵ compounds of expanded chemical spaces, by fully enumerating only the molecules that optimally fit the target pocket. Generative spaces (GDB-13, GDB-17, GDB-18, and GDBChEMBL) include all theoretically conceivable molecules. Such spaces are limited only by theoretical plausibility, with the number of drug-like molecules estimated at 10²³ to 10⁶⁰.
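The benefit of enumerating only the best-fitting molecules can be sketched with a toy two-synthon library. This is not the V-SYNTHES implementation: the additive toy_score function stands in for docking, the library sizes and TOP_K cutoff are hypothetical, and only one synthon position is pre-screened. Real docking scores are not additive, so a hierarchical search of this kind is a heuristic in practice.

```python
import random

random.seed(0)  # reproducible toy data
N_A, N_B, TOP_K = 200, 300, 10  # hypothetical library sizes and cutoff

# Hypothetical per-synthon contributions to a docking-like score (lower is better).
score_a = [random.uniform(-5, 0) for _ in range(N_A)]
score_b = [random.uniform(-5, 0) for _ in range(N_B)]


def toy_score(i: int, j: int) -> float:
    """Stand-in for docking the full molecule built from synthons i and j."""
    return score_a[i] + score_b[j]


# Brute force: score every enumerated product.
full_evals = N_A * N_B
best_full = min(toy_score(i, j) for i in range(N_A) for j in range(N_B))

# Hierarchical: score position-A synthons alone (position B capped), keep the
# best few, then enumerate complete molecules only for those survivors.
top_a = sorted(range(N_A), key=lambda i: score_a[i])[:TOP_K]
hier_evals = N_A + TOP_K * N_B
best_hier = min(toy_score(i, j) for i in top_a for j in range(N_B))

print(full_evals, hier_evals, best_full == best_hier)
```

In this toy setting the hierarchical search scores 3,200 candidates instead of 60,000 and still recovers the best molecule, because the additive score guarantees that the best synthon survives the first round.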

Although generative spaces provide broad coverage, the synthetic success rates and reaction pathways of their compounds are not known, warranting computational estimation of their synthesizability. In generative spaces, atomic graphs are used to generate saturated hydrocarbon structures and unsaturated skeletons. The skeletons are then expanded by heteroatom substitution and converted into meaningful compounds.

Computationally driven drug discovery is based on easily accessible on-demand or generative virtual chemical spaces, as well as structure-based and AI-based computational tools that streamline the drug discovery process. In comparison to the standard gene-to-lead discovery timeline of four to six years, computationally driven technology can identify potential drug candidates within 2 to 12 months.

Combining rapid, flexible docking, deep learning, or scoring approaches with higher-accuracy post-processing tools based on quantum mechanics and free energy perturbation (FEP) can increase the yield of high-affinity hits from giga-scale chemical spaces. In addition, rapidly expanding low-cost cloud computing, specialized chips, and graphics processing unit (GPU) acceleration support these computational tools.

Based on the review findings, the DDD ecosystem appears to be shifting from computer-aided to computer-driven, enabling rapid and cost-effective discovery of potent and selective leads using elaborate potency prediction tools. However, computational predictions require validation through in vitro and in vivo experiments at each step of the drug discovery pipeline.

Journal reference:

Sadybekov, A.V., Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023). DOI: https://doi.org/10.1038/s41586-023-05905-z, https://www.nature.com/articles/s41586-023-05905-z

Written by

Pooja Toshniwal Paharia

Pooja Toshniwal Paharia is an oral and maxillofacial physician and radiologist based in Pune, India. Her academic background is in Oral Medicine and Radiology. She has extensive experience in research and evidence-based clinical-radiological diagnosis and management of oral lesions and conditions and associated maxillofacial disorders.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

APA
Toshniwal Paharia, Pooja. (2023, April 28). Revolutionizing drug discovery: Giga-scale screening unleashes power of AI and virtual libraries. News-Medical. Retrieved on February 09, 2026 from https://www.news-medical.net/news/20230428/Revolutionizing-drug-discovery-Giga-scale-screening-unleashes-power-of-AI-and-virtual-libraries.aspx.
