Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding
Enabling safe exploration by reinforcement learning (RL) agents during training is a critical challenge for their deployment in many real-world scenarios. When prior knowledge of the domain or task is unavailable, training RL agents in unknown, black-box environments presents an even greater safety risk. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training, and uses this knowledge to protect the RL agent from executing actions that yield likely hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques shows that ADVICE significantly reduces safety violations (approximately 50%) during training, with a competitive outcome reward compared to other techniques.
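The post-shielding idea described in the abstract can be illustrated with a minimal sketch. Note this is an assumption-laden toy, not the paper's method: `safety_score` stands in for ADVICE's learned contrastive-autoencoder safety estimate, and the hazard geometry and threshold are invented for illustration.

```python
import numpy as np

def safety_score(state, action):
    # Hypothetical stand-in for a learned safety estimate: actions that move
    # the agent close to a hazard at the origin receive a low score.
    return float(np.linalg.norm(state + action))

def shielded_action(state, proposed_action, candidate_actions, threshold=1.0):
    """Post-shield: pass the agent's proposed action through if it looks
    safe; otherwise substitute the safest available candidate action."""
    if safety_score(state, proposed_action) >= threshold:
        return proposed_action
    # The proposed action is deemed unsafe; fall back to the candidate
    # with the highest safety score.
    return max(candidate_actions, key=lambda a: safety_score(state, a))

state = np.array([1.0, 0.0])
risky = np.array([-0.9, 0.0])  # would land the agent near the hazard
candidates = [np.array([0.5, 0.0]), np.array([0.0, 0.5]), risky]
print(shielded_action(state, risky, candidates))  # the risky action is replaced
```

The key property of a post-shield, as opposed to a pre-shield, is that it intervenes only after the agent proposes an action, leaving the policy itself untouched.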
Radu Calinescu, Simos Gerasimou, Daniel Bethell, Calum Imrie
Subjects: safety science; computing and computer technology; automation technology and equipment
Radu Calinescu, Simos Gerasimou, Daniel Bethell, Calum Imrie. Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding [EB/OL]. (2024-05-28) [2025-08-02]. https://arxiv.org/abs/2405.18180.