Uncertainty Sets for Distributionally Robust Bandits Using Structural Equation Models

Abstract

Distributionally robust evaluation estimates the worst-case expected return over an uncertainty set of possible covariate and reward distributions, and distributionally robust learning finds a policy that maximizes that worst-case return across that uncertainty set. Unfortunately, current methods for distributionally robust evaluation and learning create overly conservative evaluations and policies. In this work, we propose a practical bandit evaluation and learning algorithm that tailors the uncertainty set to specific problems using mathematical programs constrained by structural equation models. Further, we show how conditional independence testing can be used to detect shifted variables for modeling. We find that the structural equation model (SEM) approach gives more accurate evaluations and learns lower-variance policies than traditional approaches, particularly for large shifts. Further, the SEM approach learns an optimal policy, assuming the model is sufficiently well-specified.
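The two definitions in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's SEM-constrained mathematical program; it assumes a much simpler, hypothetical "box" uncertainty set in which each context probability lies between a lower and an upper bound. All names and numbers (worst_case_return, policies, lower, upper, nominal) are made up for illustration.

```python
def worst_case_return(rewards, lower, upper, tol=1e-12):
    """Minimize sum(p[i] * rewards[i]) over distributions p with
    lower[i] <= p[i] <= upper[i] and sum(p) == 1 (a hypothetical box set)."""
    if sum(lower) > 1.0 + tol or sum(upper) < 1.0 - tol:
        raise ValueError("bounds do not admit a probability distribution")
    p = list(lower)                      # start every context at its lower bound
    remaining = 1.0 - sum(lower)         # mass the adversary is still free to place
    # The adversary places the free mass on the lowest-reward contexts first.
    for i in sorted(range(len(rewards)), key=lambda j: rewards[j]):
        add = min(upper[i] - p[i], remaining)
        p[i] += add
        remaining -= add
    value = sum(pi * ri for pi, ri in zip(p, rewards))
    return value, p


# Expected reward of each (hypothetical) policy in each of three contexts.
policies = {
    "A": [1.0, 0.5, 0.0],   # strong in context 0, useless in context 2
    "B": [0.5, 0.5, 0.5],   # the same reward in every context
}
nominal = [0.5, 0.25, 0.25]  # context distribution seen in the logged data
lower = [0.25, 0.25, 0.0]    # assumed bounds on how far the distribution may shift
upper = [0.75, 0.5, 0.5]

# Robust evaluation: worst-case return of each policy over the box set.
robust_values = {name: worst_case_return(r, lower, upper)[0]
                 for name, r in policies.items()}
# Nominal evaluation: return under the logged distribution only.
nominal_values = {name: sum(p * x for p, x in zip(nominal, r))
                  for name, r in policies.items()}

# Robust learning: pick the policy with the best worst-case return.
robust_choice = max(robust_values, key=robust_values.get)
nominal_choice = max(nominal_values, key=nominal_values.get)
print(robust_values, nominal_values, robust_choice, nominal_choice)
```

In this toy setting the nominal and robust criteria pick different policies, and widening the bounds makes the robust evaluation more pessimistic. That is the conservativeness the abstract attributes to loosely specified uncertainty sets.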

Authors: Katherine Avery, Chinmay Pendse, David Jensen

Subject classification: Computing technology, computer technology

Recommended citation: Katherine Avery, Chinmay Pendse, David Jensen. Uncertainty Sets for Distributionally Robust Bandits Using Structural Equation Models [EB/OL]. (2025-08-04) [2025-08-16]. https://arxiv.org/abs/2508.02812
