An Optimisation Framework for Unsupervised Environment Design

Abstract

For reinforcement learning agents to be deployed in high-risk settings, they must achieve a high level of robustness to unfamiliar scenarios. One method for improving robustness is unsupervised environment design (UED), a suite of methods aiming to maximise an agent's generalisability across configurations of an environment. In this work, we study UED from an optimisation perspective, providing stronger theoretical guarantees for practical settings than prior work. Whereas the guarantees of previous methods hold only if they reach convergence, our framework employs a nonconvex-strongly-concave objective for which we provide a provably convergent algorithm in the zero-sum setting. We empirically verify the efficacy of our method, which outperforms prior methods in a number of environments of varying difficulty.

Authors: Nathan Monette, Alistair Letcher, Michael Beukman, Matthew T. Jackson, Alexander Rutherford, Alexander D. Goldie, Jakob N. Foerster

Subject classification: Fundamental theory of automation; Computing technology, computer technology

Recommended citation: Nathan Monette, Alistair Letcher, Michael Beukman, Matthew T. Jackson, Alexander Rutherford, Alexander D. Goldie, Jakob N. Foerster. An Optimisation Framework for Unsupervised Environment Design [EB/OL]. (2025-05-26) [2025-06-30]. https://arxiv.org/abs/2505.20659.
