ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2507.05751
98
1
v1v2 (latest)

SenseShift6D: Multimodal RGB-D Benchmarking for Robust 6D Pose Estimation across Environment and Sensor Variations

8 July 2025
Yegyu Han
Taegyoon Yoon
Dayeon Woo
Sojeong Kim
Hyung-Sin Kim
ArXiv (abs)PDFHTMLGithub
Main:9 Pages
11 Figures
Bibliography:3 Pages
14 Tables
Appendix:7 Pages
Abstract

Recent advances on 6D object-pose estimation have achieved high performance on representative benchmarks such as LM-O, YCB-V, and T-Less. However, these datasets were captured under fixed illumination and camera settings, leaving the impact of real-world variations in illumination, exposure, gain or depth-sensor mode-and the potential of test-time sensor control to mitigate such variations-largely unexplored. To bridge this gap, we introduce SenseShift6D, the first RGB-D dataset that physically sweeps 13 RGB exposures, 9 RGB gains, auto-exposure, 4 depth-capture modes, and 5 illumination levels. For five common household objects (spray, pringles, tincase, sandwich, and mouse), we acquire 166.4k RGB and 16.7k depth images, which can provide 1,380 unique sensor-lighting permutations per object pose. Experiments with state-of-the-art models on our dataset demonstrate that applying multimodal sensor control at test time yields substantial performance gains, achieving a 19.5 pp improvement on pretrained generalizable models. It also enhances robustness precisely where those models tend to fail. Moreover, even instance-level pose estimators, where train and test set share identical object and background, performance still varies under environmental and sensor change, demonstrating that test-time sensor control remains effective compared to costly expansions in the quantity and diversity of real-world training data, without any additional training. SenseShift6D extends the object pose evaluation paradigm from data-centered to sensor-aware robustness, laying a foundation for adaptive, self-tuning perception systems capable of operating robustly in uncertain real-world environments.

View on arXiv
Comments on this paper