AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation

Main: 8 Pages
9 Figures
Bibliography: 1 Page
15 Tables
Appendix: 6 Pages
Abstract

Lifting-based methods for 3D Human Pose Estimation (HPE), which predict 3D poses from detected 2D keypoints, often generalize poorly to new datasets and real-world settings. To address this, we propose AugLift, a simple yet effective reformulation of the standard lifting pipeline that significantly improves generalization performance without requiring additional data collection or sensors. AugLift sparsely enriches the standard input, the 2D keypoint coordinates $(x, y)$, by augmenting it with a keypoint detection confidence score $c$ and a corresponding depth estimate $d$. These additional signals are computed from the image using off-the-shelf, pre-trained models (e.g., for monocular depth estimation), thereby inheriting their strong generalization capabilities. Importantly, AugLift serves as a modular add-on and can be readily integrated into existing lifting architectures.
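
The input augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `augment_keypoints` is hypothetical, and reading the depth at the nearest pixel to each keypoint is an assumption, since the abstract does not say how the depth estimate is sampled or normalized. The sketch only shows how each keypoint's $(x, y)$ could be extended to $(x, y, c, d)$ before being passed to a lifting network.

```python
def augment_keypoints(xy, conf, depth_map):
    """Extend each 2D keypoint (x, y) to (x, y, c, d).

    xy        : list of (x, y) pixel coordinates, one per keypoint
    conf      : list of detector confidence scores c, one per keypoint
    depth_map : 2D list (rows x cols) of per-pixel depth estimates,
                e.g. the output of a monocular depth model

    Assumption: d is read at the pixel nearest to the keypoint; the
    abstract does not specify the sampling scheme.
    """
    height, width = len(depth_map), len(depth_map[0])
    augmented = []
    for (x, y), c in zip(xy, conf):
        # Round to the nearest pixel and clamp so keypoints detected
        # slightly outside the image still index a valid location.
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        augmented.append((x, y, c, depth_map[row][col]))
    return augmented


# Toy example: a 3x4 depth map and two keypoints.
depth_map = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0, 11.0],
]
keypoints = [(1.0, 2.0), (3.2, 0.4)]
scores = [0.9, 0.5]
features = augment_keypoints(keypoints, scores, depth_map)
```

Because only the per-keypoint input grows from two to four values, an existing lifting architecture would in principle need just a wider first layer, which is consistent with the abstract's description of AugLift as a modular add-on.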