Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation

Source: arXiv
Abstract

Vision is well-known for its use in manipulation, especially using visual servoing. To make it robust, multiple cameras are needed to expand the field of view. That is computationally challenging. Merging multiple views and using Q-learning allows the design of more effective representations and optimization of sample efficiency. Such a solution might be expensive to deploy. To mitigate this, we introduce a Merge And Disentanglement (MAD) algorithm that efficiently merges views to increase sample efficiency while augmenting with single-view features to allow lightweight deployment and ensure robust policies. We demonstrate the efficiency and robustness of our approach using Meta-World and ManiSkill3. For project website and code, see https://aalmuzairee.github.io/mad

Abdulaziz Almuzairee, Rohan Patil, Dwait Bhatt, Henrik I. Christensen

Subject: Automation technology and automation equipment

Abdulaziz Almuzairee, Rohan Patil, Dwait Bhatt, Henrik I. Christensen. Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation[EB/OL]. (2025-05-07)[2025-05-31]. https://arxiv.org/abs/2505.04619.
