Bayesian Selection for Efficient MLIP Dataset Selection


Source: arXiv
Abstract

Constructing a dataset for MLIP development that yields the highest quality for the least compute time is a complex problem that can be approached in several ways. We introduce a "Bayesian selection" approach for selecting structures from a candidate set, and compare its effectiveness against other common approaches in the task of constructing ideal datasets targeting silicon surface energies. We show that the Bayesian selection method performs much better than simple random sampling at this task (for example, the error on the (100) surface energy is 4.3x lower in the low-data regime) and is competitive with a variety of existing selection methods, using ACE and MACE features.
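
The abstract does not describe the selection algorithm itself, so the following is only a generic sketch of the contrast it draws between simple random sampling and a Bayesian-style selection from a candidate pool. It assumes a Bayesian linear model over per-structure feature vectors and greedily picks the candidate with the largest posterior predictive variance; this is a standard uncertainty-driven heuristic and is not claimed to be the authors' method. The function name `greedy_variance_selection`, the parameters `prior_precision` and `noise_var`, and the random feature matrix (a stand-in for ACE or MACE features) are all hypothetical.

```python
import numpy as np


def greedy_variance_selection(features, n_select, prior_precision=1.0, noise_var=1e-2):
    """Greedily pick rows of `features` with the largest posterior predictive variance.

    Assumes a Bayesian linear model y = w @ x + noise with an isotropic Gaussian
    prior on w. Illustrative only; not taken from the paper.
    """
    n, d = features.shape
    cov = np.eye(d) / prior_precision      # prior covariance of the weights
    available = np.ones(n, dtype=bool)     # candidates not yet selected
    selected = []
    for _ in range(n_select):
        # Predictive variance x^T cov x for every candidate in the pool.
        var = np.einsum("ij,jk,ik->i", features, cov, features)
        var[~available] = -np.inf          # never pick the same structure twice
        i = int(np.argmax(var))
        selected.append(i)
        available[i] = False
        # Rank-one (Sherman-Morrison) update of the posterior covariance
        # after "labelling" candidate i.
        x = features[i]
        cx = cov @ x
        cov = cov - np.outer(cx, cx) / (noise_var + x @ cx)
    return selected, cov


rng = np.random.default_rng(0)
pool = rng.normal(size=(200, 8))           # hypothetical stand-in for descriptor vectors
picked, posterior_cov = greedy_variance_selection(pool, n_select=10)
random_baseline = rng.choice(len(pool), size=10, replace=False)  # simple random sampling
```

In a real workflow the selected indices would be sent for reference calculations, and the two subsets would be compared by the error of the resulting potentials on the target property.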

Thomas Rocke, James Kermode

Subject: Computing technology, computer technology

Thomas Rocke, James Kermode. Bayesian Selection for Efficient MLIP Dataset Selection [EB/OL]. (2025-06-19) [2025-07-16]. https://arxiv.org/abs/2502.21165.
