Ghost Policies: A New Paradigm for Understanding and Learning from Failure in Deep Reinforcement Learning

Source: arXiv
Abstract

Deep Reinforcement Learning (DRL) agents often exhibit intricate failure modes that are difficult to understand, debug, and learn from. This opacity hinders their reliable deployment in real-world applications. To address this critical gap, we introduce "Ghost Policies," a concept materialized through Arvolution, a novel Augmented Reality (AR) framework. Arvolution renders an agent's historical failed policy trajectories as semi-transparent "ghosts" that coexist spatially and temporally with the active agent, enabling an intuitive visualization of policy divergence. Arvolution uniquely integrates: (1) AR visualization of ghost policies, (2) a behavioural taxonomy of DRL maladaptation, (3) a protocol for systematic human disruption to scientifically study failure, and (4) a dual-learning loop where both humans and agents learn from these visualized failures. We propose a paradigm shift, transforming DRL agent failures from opaque, costly errors into invaluable, actionable learning resources, laying the groundwork for a new research field: "Failure Visualization Learning."

Xabier Olaz

Subject: Computing technology, computer technology

Xabier Olaz. Ghost Policies: A New Paradigm for Understanding and Learning from Failure in Deep Reinforcement Learning [EB/OL]. (2025-06-14) [2025-06-23]. https://arxiv.org/abs/2506.12366
