2109.08238
Cited By
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
16 September 2021
Santhosh Kumar Ramakrishnan
Aaron Gokaslan
Erik Wijmans
Oleksandr Maksymets
Alexander Clegg
John Turner
Eric Undersander
Wojciech Galuba
Andrew Westbury
Angel X. Chang
Manolis Savva
Yili Zhao
Dhruv Batra
Papers citing
"Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI"
50 / 373 papers shown
Title
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Computer Vision and Pattern Recognition (CVPR), 2025
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
240
2
0
08 Apr 2025
Dexterous Manipulation through Imitation Learning: A Survey
Shan An
Ziyu Meng
Chao Tang
Yimiao Zhou
Tengyu Liu
...
Yao Mu
Ran Song
Wei Zhang
Zeng-Guang Hou
Haoyang Zhang
469
11
0
04 Apr 2025
Visual Environment-Interactive Planning for Embodied Complex-Question Answering
Ning Lan
Baoshan Ou
Xuemei Xie
G. Shi
LM&Ro
289
1
0
01 Apr 2025
MVSAnywhere: Zero-Shot Multi-View Stereo
Computer Vision and Pattern Recognition (CVPR), 2025
Sergio Izquierdo
Mohamed Sayed
Michael Firman
Guillermo Garcia-Hernando
Daniyar Turmukhambetov
Javier Civera
Oisin Mac Aodha
Gabriel J. Brostow
Jamie Watson
3DV
327
11
0
28 Mar 2025
OpenLex3D: A Tiered Evaluation Benchmark for Open-Vocabulary 3D Scene Representations
Christina Kassab
Sacha Morin
Martin Buchner
Matías Mattamala
Kumaraditya Gupta
Abhinav Valada
Liam Paull
Maurice F. Fallon
3DV
ELM
228
3
0
25 Mar 2025
Scene-agnostic Pose Regression for Visual Localization
Computer Vision and Pattern Recognition (CVPR), 2025
Junwei Zheng
Ruiping Liu
Yuxiao Chen
Zhenfang Chen
Kailun Yang
Kailai Li
Rainer Stiefelhagen
169
2
0
25 Mar 2025
DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation
Karim Abou Zeid
Kadir Yilmaz
Daan de Geus
Alexander Hermans
David B. Adrian
Timm Linder
Bastian Leibe
118
7
0
24 Mar 2025
SG-Tailor: Inter-Object Commonsense Relationship Reasoning for Scene Graph Manipulation
Haoliang Shang
Hanyu Wu
Guangyao Zhai
Boyang Sun
Fangjinhua Wang
F. Tombari
Marc Pollefeys
229
1
0
23 Mar 2025
Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation
Ziming Wei
Bingqian Lin
Yunshuang Nie
Jiaqi Chen
Shikui Ma
Hang Xu
Xiaodan Liang
420
3
0
23 Mar 2025
Do Visual Imaginations Improve Vision-and-Language Navigation Agents?
Computer Vision and Pattern Recognition (CVPR), 2025
Akhil Perincherry
Jacob Krantz
Stefan Lee
LM&Ro
220
6
0
20 Mar 2025
IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes
IEEE International Conference on Robotics and Automation (ICRA), 2025
Haochen Zhang
Nader Zantout
Pujith Kachana
Ji Zhang
VGen
234
2
0
20 Mar 2025
UniK3D: Universal Camera Monocular 3D Estimation
Computer Vision and Pattern Recognition (CVPR), 2025
Luigi Piccinelli
Daniel Gehrig
Mattia Segu
Yifan Yang
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
216
11
0
20 Mar 2025
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2025
Yiqi Zhu
Zihan Wang
Chen Zhang
Ziwei Sun
Yang Liu
CoGe
VLM
209
2
0
18 Mar 2025
MoK-RAG: Mixture of Knowledge Paths Enhanced Retrieval-Augmented Generation for Embodied AI Environments
Zhengsheng Guo
Linwei Zheng
Xinyang Chen
X. Bai
Kai Chen
Min Zhang
217
0
0
18 Mar 2025
FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks
IEEE Transactions on Multimedia (TMM), 2025
Siqi Zhang
Yanyuan Qiao
Qunbo Wang
Longteng Guo
Zhihua Wei
Qingbin Liu
LM&Ro
272
10
0
18 Mar 2025
MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation
P. Zhang
Xianqiang Gao
Yuhan Wu
Pengan Chen
Dong Wang
Zechuan Wang
Jiangwei Zhong
Yan Ding
Xiaochen Li
LM&Ro
205
11
0
14 Mar 2025
OSMa-Bench: Evaluating Open Semantic Mapping Under Varying Lighting Conditions
Maxim Popov
Regina Kurkova
Mikhail Iumanov
Jaafar Mahmoud
Sergey Kolyubin
191
1
0
13 Mar 2025
PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation
Neural Networks (NN), 2025
Sen Wang
Dongliang Zhou
Liang Xie
Chao Xu
Ye Yan
Erwei Yin
DiffM
286
6
0
13 Mar 2025
Embodied Crowd Counting
Runling Long
Yunlong Wang
Jia Wan
Xiang Deng
Xinting Zhu
Weili Guan
Antoni B. Chan
Liqiang Nie
253
0
0
11 Mar 2025
Reasoning in visual navigation of end-to-end trained agents: a dynamical systems approach
Computer Vision and Pattern Recognition (CVPR), 2025
Steeven Janny
Hervé Poirier
L. Antsfeld
G. Bono
G. Monaci
Boris Chidlovskii
Francesco Giuliari
Alessio Del Bue
Christian Wolf
LM&Ro
521
3
0
11 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Computer Vision and Pattern Recognition (CVPR), 2025
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
412
3
0
10 Mar 2025
Handle Object Navigation as Weighted Traveling Repairman Problem
Ruimeng Liu
Xinhang Xu
Shenghai Yuan
Lihua Xie
321
5
0
10 Mar 2025
WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation
Dujun Nie
Xianda Guo
Yiqun Duan
Ruijun Zhang
Long Chen
LM&Ro
581
18
0
04 Mar 2025
OVAMOS: A Framework for Open-Vocabulary Multi-Object Search in Unknown Environments
Qianwei Wang
Yifan Xu
V. Kamat
Carol Menassa
228
2
0
03 Mar 2025
AirRoom: Objects Matter in Room Reidentification
Computer Vision and Pattern Recognition (CVPR), 2025
Runmao Yao
Yi Du
Zhuoqun Chen
Haoze Zheng
Chen Wang
298
0
0
03 Mar 2025
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler
Luigi Piccinelli
Daniel Gehrig
Yifan Yang
Mattia Segu
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
337
57
0
27 Feb 2025
Ground-level Viewpoint Vision-and-Language Navigation in Continuous Environments
IEEE International Conference on Robotics and Automation (ICRA), 2025
Zerui Li
Gengze Zhou
Haodong Hong
Yanyan Shao
Wenqi Lyu
Yanyuan Qiao
Qi Wu
247
4
0
26 Feb 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Neural Information Processing Systems (NeurIPS), 2024
Luca Barsellotti
Roberto Bigazzi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
401
2
0
20 Feb 2025
IM360: Large-scale Indoor Mapping with 360 Cameras
Dongki Jung
Jaehoon Choi
Yonghan Lee
Dinesh Manocha
363
1
0
18 Feb 2025
REGNav: Room Expert Guided Image-Goal Navigation
AAAI Conference on Artificial Intelligence (AAAI), 2025
Pengna Li
Kangyi Wu
Jingwen Fu
Sanping Zhou
323
5
0
15 Feb 2025
NextBestPath: Efficient 3D Mapping of Unseen Environments
International Conference on Learning Representations (ICLR), 2025
Shiyao Li
Antoine Guédon
Clémentin Boittiaux
Shizhe Chen
Vincent Lepetit
211
3
0
07 Feb 2025
Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion
Computer Vision and Pattern Recognition (CVPR), 2025
Vitor Campagnolo Guizilini
Muhammad Zubair Irshad
Dian Chen
G. Shakhnarovich
Rares Andrei Ambrus
DiffM
224
6
0
30 Jan 2025
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Kei Katsumata
Motonari Kambara
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
356
0
0
28 Jan 2025
Visual Semantic Navigation with Real Robots
Carlos Gutiérrez-Álvarez
Pablo Ríos-Navarro
Rafael Flor-Rodríguez
Francisco Javier Acevedo-Rodríguez
Roberto J. López-Sastre
350
4
0
10 Jan 2025
FrontierNet: Learning Visual Cues to Explore
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Boyang Sun
Hanzhi Chen
Stefan Leutenegger
Cesar Cadena
Marc Pollefeys
Hermann Blum
292
6
0
08 Jan 2025
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
Zhangyang Qi
Zhixiong Zhang
Ye Fang
Yuan Liu
Hengshuang Zhao
587
47
0
02 Jan 2025
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Joey Tianyi Zhou
Parisa Kordjamshidi
LRM
319
57
0
31 Dec 2024
Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling
IEEE Robotics and Automation Letters (RA-L), 2024
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
361
0
0
21 Dec 2024
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
Computer Vision and Pattern Recognition (CVPR), 2024
Z-H. Tang
Yuchen Fan
Dilin Wang
Hongyu Xu
Rakesh Ranjan
Alex Schwing
Zhicheng Yan
3DGS
VGen
3DV
192
73
0
09 Dec 2024
Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images
Computer Vision and Pattern Recognition (CVPR), 2024
Zheng Chen
Chenming Wu
Zhelun Shen
Chen Zhao
Weicai Ye
Haocheng Feng
Errui Ding
Song-Hai Zhang
3DGS
195
0
0
09 Dec 2024
TANGO: Training-free Embodied AI Agents for Open-world Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Filippo Ziliotto
Tommaso Campari
Luciano Serafini
Lamberto Ballan
LLMAG
LM&Ro
MLLM
LRM
299
10
0
05 Dec 2024
Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance
IEEE Robotics and Automation Letters (RA-L), 2024
Jing Zeng
Qi Ye
Tianle Liu
Yang Xu
Jin Li
Jinming Xu
Liang Li
Jiming Chen
3DGS
3DV
175
3
0
03 Dec 2024
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
Computer Vision and Pattern Recognition (CVPR), 2024
Hongyan Zhi
Peihao Chen
Junyan Li
Shuailei Ma
Xinyu Sun
Tianhang Xiang
Yinjie Lei
Mingkui Tan
Chuang Gan
391
21
0
02 Dec 2024
AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans
Dillon Loh
Tomasz Bednarz
Xinxing Xia
Frank Guan
244
1
0
27 Nov 2024
g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Zihan Wang
Gim Hee Lee
216
5
0
26 Nov 2024
DiffDesign: Controllable Diffusion with Meta Prior for Efficient Interior Design Generation
PLoS ONE (PLoS ONE), 2024
Yuxuan Yang
Wenwen Qiang
DiffM
522
2
0
25 Nov 2024
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning
Computer Vision and Pattern Recognition (CVPR), 2024
Yuncong Yang
Han Yang
Jiachen Zhou
Peihao Chen
Hongxin Zhang
Yilun Du
Chuang Gan
376
0
0
23 Nov 2024
VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation
Bangguo Yu
Yuzhen Liu
Lei Han
Hamidreza Kasaei
Tingguang Li
M. Cao
LM&Ro
293
8
0
18 Nov 2024
Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting
Neural Information Processing Systems (NeurIPS), 2024
Yian Wang
Xiaowen Qiu
Jiageng Liu
Zhehuan Chen
Jiting Cai
Yufei Wang
Tsun-Hsuan Wang
Zhou Xian
Chuang Gan
VGen
AI4CE
236
20
0
14 Nov 2024
VLA-3D: A Dataset for 3D Semantic Scene Understanding and Navigation
Haochen Zhang
Nader Zantout
Pujith Kachana
Zongyuan Wu
Ji Zhang
3DV
LM&Ro
237
11
0
05 Nov 2024