Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation

Abstract

Vision is well-known for its use in manipulation, especially using visual servoing. To make it robust, multiple cameras are needed to expand the field of view. That is computationally challenging. Merging multiple views and using Q-learning allows the design of more effective representations and optimization of sample efficiency. Such a solution might be expensive to deploy. To mitigate this, we introduce a Merge And Disentanglement (MAD) algorithm that efficiently merges views to increase sample efficiency while augmenting with single-view features to allow lightweight deployment and ensure robust policies. We demonstrate the efficiency and robustness of our approach using Meta-World and ManiSkill3. For project website and code, see https://aalmuzairee.github.io/mad

Authors: Abdulaziz Almuzairee, Rohan Patil, Dwait Bhatt, Henrik I. Christensen

Subject classification: Automation technology; automation technology equipment

Recommended citation: Abdulaziz Almuzairee, Rohan Patil, Dwait Bhatt, Henrik I. Christensen. Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation [EB/OL]. (2025-05-07) [2025-05-31]. https://arxiv.org/abs/2505.04619.
