Large Language Model Agent for Modular Task Execution in Drug Discovery
Large Language Model Agent for Modular Task Execution in Drug Discovery
We present a modular framework powered by large language models (LLMs) that automates and streamlines key tasks across the early-stage computational drug discovery pipeline. By combining LLM reasoning with domain-specific tools, the framework performs biomedical data retrieval, domain-specific question answering, molecular generation, property prediction, property-aware molecular refinement, and 3D protein-ligand structure generation. In a case study targeting BCL-2 in lymphocytic leukemia, the agent autonomously retrieved relevant biomolecular information-including FASTA sequences, SMILES representations, and literature-and answered mechanistic questions with improved contextual accuracy over standard LLMs. It then generated chemically diverse seed molecules and predicted 67 ADMET-related properties, which guided iterative molecular refinement. Across two refinement rounds, the number of molecules with QED > 0.6 increased from 34 to 55, and those passing at least four out of five empirical drug-likeness rules rose from 29 to 52, within a pool of 194 molecules. The framework also employed Boltz-2 to generate 3D protein-ligand complexes and provide rapid binding affinity estimates for candidate compounds. These results demonstrate that the approach effectively supports molecular screening, prioritization, and structure evaluation. Its modular design enables flexible integration of evolving tools and models, providing a scalable foundation for AI-assisted therapeutic discovery.
Janghoon Ock、Radheesh Sharma Meda、Srivathsan Badrinarayanan、Neha S. Aluru、Achuth Chandrasekhar、Amir Barati Farimani
药学生物科学研究方法、生物科学研究技术
Janghoon Ock,Radheesh Sharma Meda,Srivathsan Badrinarayanan,Neha S. Aluru,Achuth Chandrasekhar,Amir Barati Farimani.Large Language Model Agent for Modular Task Execution in Drug Discovery[EB/OL].(2025-06-26)[2025-07-16].https://arxiv.org/abs/2507.02925.点此复制
评论