Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation
Neural Information Processing Systems (NeurIPS), 2024 |
Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation
Conference on Robot Learning (CoRL), 2024 |
Pandora: Towards General World Model with Natural Language Actions and Video States
Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, ..., Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu |
EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking
European Conference on Computer Vision (ECCV), 2024 |