National Preprint Platform

Research on the Option-Critic Algorithm Based on Representation Erasure

Abstract

The Option-Critic (OC) framework can extract transferable abstract knowledge without any environment-specific prior knowledge, learning options (temporally abstract policies) end-to-end. However, the OC framework exhibits low data efficiency in transfer tasks: during learning, each option considers the entire state space of the task, which enlarges the policy search space. This paper proposes an option learning algorithm based on representation erasure. The representation erasure method quantifies the influence of each dimension on high-level and low-level policy learning, then identifies and erases the dimensions that significantly interfere with training, thereby reducing the size of the policy search space. Theoretical derivation and experiments demonstrate the effectiveness of the proposed algorithm.
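The abstract does not give the exact scoring rule, so the following is only a minimal sketch of the general idea of representation erasure: zero out one state dimension at a time and measure how much the policy's log-probability of the taken action changes. The linear-softmax policy, the function names (`policy_probs`, `erasure_importance`, `build_mask`), and the rule of dropping low-influence dimensions are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Hypothetical linear-softmax policy: weights[a][d] scores action a from dimension d."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def erase(state, dim):
    """Return a copy of the state with one dimension set to zero."""
    erased = list(state)
    erased[dim] = 0.0
    return erased


def erasure_importance(weights, states, actions):
    """Per dimension: mean drop in log-probability of the taken action after erasing it."""
    scores = []
    for d in range(len(states[0])):
        total = 0.0
        for s, a in zip(states, actions):
            before = math.log(policy_probs(weights, s)[a])
            after = math.log(policy_probs(weights, erase(s, d))[a])
            total += before - after
        scores.append(total / len(states))
    return scores


def build_mask(scores, threshold):
    """Assumed selection rule: keep (1.0) only dimensions whose erasure changes the policy."""
    return [1.0 if abs(sc) > threshold else 0.0 for sc in scores]


# Toy data: two actions, two state dimensions; the policy ignores dimension 1.
weights = [[2.0, 0.0], [-2.0, 0.0]]
states = [[1.0, 5.0], [-1.0, 3.0]]
actions = [0, 1]

scores = erasure_importance(weights, states, actions)
mask = build_mask(scores, threshold=0.1)
print(scores, mask)
```

In this toy setup, erasing dimension 1 leaves the policy unchanged, so a mask built this way would hide it from the option's policy. How the paper decides which dimensions "interfere" with high-level versus low-level policy learning may differ from this threshold rule.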

孟俊伟 (Meng Junwei), 胡铮 (Hu Zheng)

Computing technology, computer technology

Artificial Intelligence; Transfer Learning; Hierarchical Reinforcement Learning; Representation Erasure

孟俊伟 (Meng Junwei), 胡铮 (Hu Zheng). Research on the Option-Critic Algorithm Based on Representation Erasure [EB/OL]. (2024-02-26) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/202402-71.
