Experimenting, Fast and Slow: Bayesian Optimization of Long-term Outcomes with Online Experiments

Abstract

Online experiments in internet systems, also known as A/B tests, are used for a wide range of system tuning problems, such as optimizing recommender system ranking policies and learning adaptive streaming controllers. Decision-makers generally wish to optimize for long-term treatment effects of the system changes, which often requires running experiments for a long time as short-term measurements can be misleading due to non-stationarity in treatment effects over time. The sequential experimentation strategies--which typically involve several iterations--can be prohibitively long in such cases. We describe a novel approach that combines fast experiments (e.g., biased experiments run only for a few hours or days) and/or offline proxies (e.g., off-policy evaluation) with long-running, slow experiments to perform sequential, Bayesian optimization over large action spaces in a short amount of time.

Authors: Benjamin Letham, Maximilian Balandat, Eytan Bakshy, Qing Feng, Samuel Daulton

DOI: 10.1145/3690624.3709419

Subject classification: Computing technology, computer technology

Recommended citation: Benjamin Letham, Maximilian Balandat, Eytan Bakshy, Qing Feng, Samuel Daulton. Experimenting, Fast and Slow: Bayesian Optimization of Long-term Outcomes with Online Experiments [EB/OL]. (2025-06-30) [2025-07-09]. https://arxiv.org/abs/2506.18744.
