AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation

Source: arXiv
Abstract

Lifting-based methods for 3D Human Pose Estimation (HPE), which predict 3D poses from detected 2D keypoints, often generalize poorly to new datasets and real-world settings. To address this, we propose \emph{AugLift}, a simple yet effective reformulation of the standard lifting pipeline that significantly improves generalization performance without requiring additional data collection or sensors. AugLift sparsely enriches the standard input -- the 2D keypoint coordinates $(x, y)$ -- by augmenting it with a keypoint detection confidence score $c$ and a corresponding depth estimate $d$. These additional signals are computed from the image using off-the-shelf, pre-trained models (e.g., for monocular depth estimation), thereby inheriting their strong generalization capabilities. Importantly, AugLift serves as a modular add-on and can be readily integrated into existing lifting architectures. Our extensive experiments across four datasets demonstrate that AugLift boosts cross-dataset performance on unseen datasets by an average of $10.1\%$, while also improving in-distribution performance by $4.0\%$. These gains are consistent across various lifting architectures, highlighting the robustness of our method. Our analysis suggests that these sparse, keypoint-aligned cues provide robust frame-level context, offering a practical way to significantly improve the generalization of any lifting-based pose estimation model. Code will be made publicly available.
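
The input enrichment described in the abstract, going from $(x, y)$ to $(x, y, c, d)$ per keypoint, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `augment_keypoints`, the array layouts, and the nearest-pixel depth lookup are all assumptions made for illustration, and the confidence scores and depth map are assumed to come from whatever off-the-shelf detector and monocular depth model are in use.

```python
import numpy as np


def augment_keypoints(xy, conf, depth_map):
    """Build per-keypoint (x, y, c, d) features (hypothetical helper).

    xy:        (J, 2) pixel coordinates, x = column, y = row.
    conf:      (J,) keypoint detection confidence scores.
    depth_map: (H, W) per-pixel depth estimate for the same image.
    """
    xy = np.asarray(xy, dtype=np.float32)
    conf = np.asarray(conf, dtype=np.float32)
    depth_map = np.asarray(depth_map)
    h, w = depth_map.shape

    # Assumption: sample depth at the nearest pixel, clamped to the image
    # bounds so that keypoints detected slightly off-frame still get a value.
    cols = np.clip(np.rint(xy[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(xy[:, 1]).astype(int), 0, h - 1)
    d = depth_map[rows, cols].astype(np.float32)

    # Concatenate into a (J, 4) array: x, y, confidence, depth.
    return np.concatenate([xy, conf[:, None], d[:, None]], axis=1)
```

Under these assumptions, a lifting network that previously consumed a `(J, 2)` input would only need its first layer widened to accept `(J, 4)`, which is one way to read the abstract's description of AugLift as a modular add-on.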

Nikolai Warner, Wenjin Zhang, Irfan Essa, Apaar Sadhwani

Subject: Computing technology, computer technology

Nikolai Warner, Wenjin Zhang, Irfan Essa, Apaar Sadhwani. AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation [EB/OL]. (2025-08-16) [2025-08-24]. https://arxiv.org/abs/2508.07112.
