HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images
Hand-object 3D reconstruction has become increasingly important for applications in human-robot interaction and immersive AR/VR experiences. A common approach for object-agnostic hand-object reconstruction from RGB sequences involves a two-stage pipeline: hand-object 3D tracking followed by multi-view 3D reconstruction. However, existing methods rely on keypoint detection techniques, such as Structure from Motion (SfM) and hand-keypoint optimization, which struggle with diverse object geometries, weak textures, and mutual hand-object occlusions, limiting scalability and generalization. To enable generic, seamless, and non-intrusive use, we propose a robust, keypoint-detector-free approach to estimating hand-object 3D transformations from monocular motion video/images. We further integrate this with a multi-view reconstruction pipeline to accurately recover hand-object 3D shape. Our method, named HOSt3R, is unconstrained, does not rely on pre-scanned object templates or camera intrinsics, and reaches state-of-the-art performance for the tasks of object-agnostic hand-object 3D transformation and shape estimation on the SHOWMe benchmark. We also experiment on sequences from the HO3D dataset, demonstrating generalization to unseen object categories.
Anilkumar Swamy, Vincent Leroy, Philippe Weinzaepfel, Jean-Sébastien Franco, Grégory Rogez
Computing technology, computer technology
Anilkumar Swamy, Vincent Leroy, Philippe Weinzaepfel, Jean-Sébastien Franco, Grégory Rogez. HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images [EB/OL]. (2025-08-25) [2025-09-06]. https://arxiv.org/abs/2508.16465.