NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries

Source: arXiv
Abstract

We present NEBULA, the first latent 3D generative model for scalable generation of large molecular libraries around a seed compound of interest. Such libraries are crucial for scientific discovery, but it remains challenging to generate large numbers of high quality samples efficiently. 3D-voxel-based methods have recently shown great promise for generating high quality samples de novo from random noise (Pinheiro et al., 2023). However, sampling in 3D-voxel space is computationally expensive and use in library generation is prohibitively slow. Here, we instead perform neural empirical Bayes sampling (Saremi & Hyvarinen, 2019) in the learned latent space of a vector-quantized variational autoencoder. NEBULA generates large molecular libraries nearly an order of magnitude faster than existing methods without sacrificing sample quality. Moreover, NEBULA generalizes better to unseen drug-like molecules, as demonstrated on two public datasets and multiple recently released drugs. We expect the approach herein to be highly enabling for machine learning-based drug discovery. The code is available at https://github.com/prescient-design/nebula
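The latent-space sampling idea in the abstract can be illustrated with a minimal walk-jump sketch. This is not the NEBULA implementation: it assumes a toy Gaussian latent prior so that the denoiser is available in closed form (NEBULA would use a learned denoiser over VQ-VAE latents), it uses plain overdamped Langevin steps, and the names `denoise`, `walk_jump`, `z_seed`, `tau`, `sigma`, and `step` are hypothetical.

```python
import numpy as np


def denoise(y, mu, tau, sigma):
    """Posterior mean E[x | y] for x ~ N(mu, tau^2 I) and y = x + N(0, sigma^2 I).

    Toy stand-in (assumption) for a learned neural denoiser.
    """
    w = tau**2 / (tau**2 + sigma**2)
    return w * y + (1.0 - w) * mu


def walk_jump(z_seed, mu, tau, sigma, n_samples=8, n_steps=20, step=0.05, rng=None):
    """Draw latent samples around a seed latent with walk-jump sampling."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Start every chain from a noised copy of the seed latent.
    y = z_seed + sigma * rng.standard_normal((n_samples, z_seed.size))
    for _ in range(n_steps):
        # Walk: Langevin step on the smoothed density, whose score
        # is (E[x | y] - y) / sigma^2 by the empirical Bayes identity.
        score = (denoise(y, mu, tau, sigma) - y) / sigma**2
        y = y + step * score + np.sqrt(2.0 * step) * rng.standard_normal(y.shape)
    # Jump: map the noisy samples back to clean latents with the denoiser.
    return denoise(y, mu, tau, sigma)


if __name__ == "__main__":
    seed_latent = np.zeros(4)
    samples = walk_jump(seed_latent, mu=np.zeros(4), tau=1.0, sigma=0.5)
    print(samples.shape)
```

In a full pipeline, the returned latents would be passed through the autoencoder's decoder to obtain molecules; the noise level `sigma` then controls how far the library strays from the seed compound.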

Authors: Andrew Martin Watkins, Pedro O. Pinheiro, Michael Maser, Sai Pooja Mahajan, Omar Mahmood, Ewa M. Nowara, Saeed Saremi

Subjects: Biological science research methods and techniques; Pharmacy; Computing and computer technology

Citation: Andrew Martin Watkins, Pedro O. Pinheiro, Michael Maser, Sai Pooja Mahajan, Omar Mahmood, Ewa M. Nowara, Saeed Saremi. NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries [EB/OL]. (2024-07-03) [2025-05-13]. https://arxiv.org/abs/2407.03428.
