TAPIP3D: Tracking Any Point in Persistent 3D Geometry

Source: arXiv

Abstract

We introduce TAPIP3D, a novel approach for long-term 3D point tracking in monocular RGB and RGB-D videos. TAPIP3D represents videos as camera-stabilized spatio-temporal feature clouds, leveraging depth and camera motion information to lift 2D video features into a 3D world space where camera movement is effectively canceled out. Within this stabilized 3D representation, TAPIP3D iteratively refines multi-frame motion estimates, enabling robust point tracking over long time horizons. To handle the irregular structure of 3D point distributions, we propose a 3D Neighborhood-to-Neighborhood (N2N) attention mechanism - a 3D-aware contextualization strategy that builds informative, spatially coherent feature neighborhoods to support precise trajectory estimation. Our 3D-centric formulation significantly improves performance over existing 3D point tracking methods and even surpasses state-of-the-art 2D pixel trackers in accuracy when reliable depth is available. The model supports inference in both camera-centric (unstabilized) and world-centric (stabilized) coordinates, with experiments showing that compensating for camera motion leads to substantial gains in tracking robustness. By replacing the conventional 2D square correlation windows used in prior 2D and 3D trackers with a spatially grounded 3D attention mechanism, TAPIP3D achieves strong and consistent results across multiple 3D point tracking benchmarks. Project Page: https://tapip3d.github.io
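The abstract's central operation, lifting per-pixel features into a camera-stabilized world frame, amounts to unprojecting each pixel with its depth and applying that frame's camera-to-world transform; spatially grounded 3D neighborhoods then take the place of 2D square correlation windows. Below is a minimal NumPy sketch of these two steps, assuming pinhole intrinsics K and per-frame camera-to-world extrinsics; the function names are illustrative and not taken from the TAPIP3D codebase.

```python
import numpy as np

def lift_to_world(depth, K, cam_to_world):
    """Unproject a depth map into 3D points in a shared world frame.

    depth        : (H, W) per-pixel depth
    K            : (3, 3) pinhole camera intrinsics
    cam_to_world : (4, 4) camera-to-world extrinsics for this frame
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Back-project pixels through the inverse intrinsics, then scale by depth.
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Map camera-frame points into the world frame, cancelling camera motion.
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]

def knn_neighborhood(points, query, k=16):
    """Indices of the k nearest 3D points to `query` (brute force)."""
    d2 = np.sum((points - query) ** 2, axis=1)
    return np.argpartition(d2, k)[:k]
```

Because every frame's points land in the same world frame, a static scene point keeps approximately constant coordinates over time, which is what makes camera motion compensation possible; neighborhoods gathered in 3D stay spatially coherent even where a fixed 2D square window would straddle depth discontinuities.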

Bowei Zhang, Lei Ke, Adam W. Harley, Katerina Fragkiadaki

Subjects: Computing Technology, Computer Technology

Bowei Zhang, Lei Ke, Adam W. Harley, Katerina Fragkiadaki. TAPIP3D: Tracking Any Point in Persistent 3D Geometry [EB/OL]. (2025-04-20) [2025-06-12]. https://arxiv.org/abs/2504.14717.