
Ag2x2: Robust Agent-Agnostic Visual Representations for Zero-Shot Bimanual Manipulation

Source: arXiv
Abstract

Bimanual manipulation, fundamental to human daily activities, remains a challenging task due to the inherent complexity of coordinated control. Recent advances have enabled zero-shot learning of single-arm manipulation skills through agent-agnostic visual representations derived from human videos; however, these methods overlook crucial agent-specific information necessary for bimanual coordination, such as end-effector positions. We propose Ag2x2, a computational framework for bimanual manipulation built on coordination-aware visual representations that jointly encode object states and hand motion patterns while maintaining agent-agnosticism. Extensive experiments demonstrate that Ag2x2 achieves a 73.5% success rate across 13 diverse bimanual tasks from Bi-DexHands and PerAct2, including challenging scenarios with deformable objects like ropes. It outperforms baseline methods and even surpasses policies trained with expert-engineered rewards. Furthermore, we show that representations learned through Ag2x2 can be effectively leveraged for imitation learning, establishing a scalable pipeline for skill acquisition without expert supervision. By maintaining robust performance across diverse tasks without human demonstrations or engineered rewards, Ag2x2 represents a step toward scalable learning of complex bimanual robotic skills.
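
The abstract describes visual representations that jointly encode object states and the motion of both hands, and that replace engineered rewards for policy learning. A common way to turn such a representation into a zero-shot reward signal is embedding similarity to a goal state; the minimal Python sketch below illustrates only that general idea. All names (encode_frame, representation_reward) and the concatenate-and-normalize encoder are illustrative assumptions, not the paper's actual architecture or API, which would use a learned visual encoder trained on human videos.

    import numpy as np

    def encode_frame(object_feat: np.ndarray,
                     left_hand_feat: np.ndarray,
                     right_hand_feat: np.ndarray) -> np.ndarray:
        """Toy coordination-aware embedding: concatenate object-state and
        per-hand motion features, then L2-normalize. Stand-in for a learned
        agent-agnostic visual encoder (hypothetical, not the paper's model)."""
        z = np.concatenate([object_feat, left_hand_feat, right_hand_feat])
        return z / (np.linalg.norm(z) + 1e-8)

    def representation_reward(z_current: np.ndarray, z_goal: np.ndarray) -> float:
        """Dense reward as cosine similarity between the current and goal
        embeddings (unit vectors), in lieu of an expert-engineered reward."""
        return float(z_current @ z_goal)

    # Usage sketch: reward rises as the encoded scene approaches the goal frame,
    # e.g. a goal embedding extracted from a human demonstration video.
    rng = np.random.default_rng(0)
    obj = rng.standard_normal(32)
    lh, rh = rng.standard_normal(8), rng.standard_normal(8)
    z_t = encode_frame(obj, lh, rh)
    z_g = encode_frame(obj, lh + 0.1, rh)
    print(representation_reward(z_t, z_g))

Because the embedding is computed from visual features rather than robot state, the same reward can in principle drive policies for different embodiments, which is what "agent-agnostic" refers to in the abstract.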

Ziyin Xiong, Yinghan Chen, Puhao Li, Yixin Zhu, Tengyu Liu, Siyuan Huang

Subject: Computing Technology, Computer Technology

Ziyin Xiong, Yinghan Chen, Puhao Li, Yixin Zhu, Tengyu Liu, Siyuan Huang. Ag2x2: Robust Agent-Agnostic Visual Representations for Zero-Shot Bimanual Manipulation [EB/OL]. (2025-07-26) [2025-08-10]. https://arxiv.org/abs/2507.19817.