Matterport3D: Learning from RGB-D Data in Indoor Environments

18 September 2017
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV, 3DPC

Papers citing "Matterport3D: Learning from RGB-D Data in Indoor Environments"

50 / 1,327 papers shown
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Chaoyang Wang
Xiangtai Li
Lu Qi
X. Lin
Jinbin Bai
Qianyu Zhou
Yunhai Tong
DiffM
322
3
0
22 May 2025
Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses
Christopher Ick
Gordon Wichern
Yoshiki Masuyama
François Germain
Jonathan Le Roux
AI4CE
250
3
0
19 May 2025
SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence
Jiabin Chen
Haiping Wang
Jinpeng Li
Yuan Liu
Zhen Dong
Bisheng Yang
397
2
0
19 May 2025
BadNAVer: Exploring Jailbreak Attacks On Vision-and-Language Navigation
Wenqi Lyu
Zerui Li
Yanyuan Qiao
Qi Wu
AAML
678
1
0
18 May 2025
Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?
Zihao Dongfang
Xu Zheng
Ziqiao Weng
Yuanhuiyi Lyu
Danda Pani Paudel
Luc Van Gool
Kailun Yang
Xuming Hu
LRM
276
8
0
17 May 2025
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
Zihan Wang
Seungjun Lee
Gim Hee Lee
VGen
396
4
0
16 May 2025
Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities
Zachary Ravichandran
Fernando Cladera
Jason Hughes
Varun Murali
M. Hsieh
George J. Pappas
Camillo J Taylor
Vijay Kumar
LM&Ro
327
2
0
14 May 2025
NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance
Wenzhe Cai
Jiaqi Peng
Yuqiang Yang
Yanmei Zhang
Meng Wei
Hanqing Wang
Yilun Chen
Tai Wang
Jiangmiao Pang
389
21
0
13 May 2025
VISTA: Generative Visual Imagination for Vision-and-Language Navigation
Yanjia Huang
Mingyang Wu
Renjie Li
Zhengzhong Tu
LM&Ro
575
4
0
09 May 2025
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding
International Conference on Learning Representations (ICLR), 2025
Henry Zheng
Hao Shi
Qihang Peng
Yong Xien Chng
Rui Huang
Yepeng Weng
Peng Wang
Gao Huang
311
8
0
08 May 2025
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Weichen Zhang
Chen Gao
Shiquan Yu
Ruiying Peng
Baining Zhao
Qian Zhang
Jinqiang Cui
Xinlei Chen
Yongqian Li
LLMAG, LM&Ro
584
7
0
08 May 2025
LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs
Xinyuan Zhang
Yonglin Tian
Fei Lin
Yue Liu
Jing Ma
Kornélia Sára Szatmáry
Fei Wang
361
6
0
06 May 2025
Estimating Commonsense Scene Composition on Belief Scene Graphs
IEEE International Conference on Robotics and Automation (ICRA), 2025
Mario A. V. Saucedo
Vignesh Kottayam Viswanathan
Christoforos Kanellakis
G. Nikolakopoulos
3DV
314
0
0
05 May 2025
PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications
Trisanth Srinivasan
Santosh Patapati
379
3
0
03 May 2025
A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI
Lik Hang Kenny Wong
Xueyang Kang
Kaixin Bai
Jianwei Zhang
394
11
0
01 May 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
432
16
0
30 Apr 2025
CasaGPT: Cuboid Arrangement and Scene Assembly for Interior Design
Computer Vision and Pattern Recognition (CVPR), 2025
Weitao Feng
Hang Zhou
Jing Liao
Li Cheng
Wenbo Zhou
3DV
275
4
0
28 Apr 2025
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng
Feishi Wang
Songlin Wei
Yuchen Ren
Bangjun Wang
...
Hao Dong
Siyuan Huang
Yue Wang
Jitendra Malik
Pieter Abbeel
433
40
0
26 Apr 2025
SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
Nader Zantout
Haochen Zhang
Pujith Kachana
J. Qiu
Ji Zhang
Wenshan Wang
LM&Ro, LRM
796
6
0
25 Apr 2025
Dynamic Camera Poses and Where to Find Them
Computer Vision and Pattern Recognition (CVPR), 2025
C. Rockwell
Joseph Tung
Nayeon Lee
Xuan Li
David Fouhey
Chen-Hsuan Lin
411
14
0
24 Apr 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro, LRM
322
1
0
22 Apr 2025
ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-centric Semantic Fusion
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Mingjie Zhang
Yuheng Du
Chengkai Wu
Jinni Zhou
Zhenchao Qi
Jun Ma
Boyu Zhou
596
8
0
20 Apr 2025
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding
Yuchen Rao
Stefan Ainetter
Sinisa Stekovic
Vincent Lepetit
Friedrich Fraundorfer
3DPC, 3DV
1.2K
0
0
18 Apr 2025
RefComp: A Reference-guided Unified Framework for Unpaired Point Cloud Completion
IEEE Transactions on Multimedia (TMM), 2025
Yixuan Yang
Jinyu Yang
Zixiang Zhao
Victor Sanchez
Feng Zheng
259
0
0
18 Apr 2025
Digital Twin Generation from Visual Data: A Survey
Andrew Melnik
Benjamin Alt
Giang Hoang Nguyen
Artur Wilkowski
Maciej Stefańczyk
Qirui Wu
Sinan Harms
Helge Rhodin
Manolis Savva
Michael Beetz
3DGS, VGen
465
4
0
17 Apr 2025
Real-World Depth Recovery via Structure Uncertainty Modeling and Inaccurate GT Depth Fitting
Delong Suzhang
Meng Yang
219
0
0
16 Apr 2025
CL-CoTNav: Closed-Loop Hierarchical Chain-of-Thought for Zero-Shot Object-Goal Navigation with Vision-Language Models
Yuxin Cai
Xiangkun He
Maonan Wang
Hongliang Guo
W. Yau
Chen Lv
LM&Ro, LRM
380
6
0
11 Apr 2025
Endowing Embodied Agents with Spatial Reasoning Capabilities for Vision-and-Language Navigation
Luo Ling
Bai Qianqian
LM&Ro
268
3
0
09 Apr 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Computer Vision and Pattern Recognition (CVPR), 2025
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
278
2
0
08 Apr 2025
LPA3D: 3D Room-Level Scene Generation from In-the-Wild Images
M. Yang
Yu-Xiao Guo
Yang Liu
Bin Zhou
Xin Tong
3DV
240
0
0
03 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Information Fusion (Inf. Fusion), 2025
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
445
37
0
03 Apr 2025
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan
Hong-Xing Yu
Sirui Chen
L. Fei-Fei
Jiajun Wu
VGen
397
46
0
01 Apr 2025
COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation
Siqi Zhang
Yanyuan Qiao
Qunbo Wang
Zike Yan
Qi Wu
Zhihua Wei
Qingbin Liu
530
3
0
31 Mar 2025
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model
Jannik Endres
Oliver Hahn
Charles Corbière
Simone Schaub-Meyer
Stefan Roth
Alexandre Alahi
MDE
376
0
0
30 Mar 2025
Empowering Large Language Models with 3D Situation Awareness
Computer Vision and Pattern Recognition (CVPR), 2025
Zhihao Yuan
Yibo Peng
Jinke Ren
Yinghong Liao
Yatong Han
Chun-Mei Feng
Hengshuang Zhao
G. Li
Shuguang Cui
Ge Wang
327
3
0
29 Mar 2025
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
Jiahui Zhang
Yurui Chen
Yanpeng Zhou
Yueming Xu
Ze Huang
...
Xinyue Cai
G. Huang
Xingyue Quan
Hang Xu
Li Zhang
LRM
613
2
0
29 Mar 2025
Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments
Yifan Xu
V. Kamat
Carol Menassa
294
0
0
29 Mar 2025
uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Jonathan Lee
Bolivar Solarte
Chin-Hsuan Wu
Jin-Cheng Jhang
Fu-En Wang
Yi-Hsuan Tsai
Min Sun
237
1
0
27 Mar 2025
Scene-agnostic Pose Regression for Visual Localization
Computer Vision and Pattern Recognition (CVPR), 2025
Junwei Zheng
Ruiping Liu
Yuxiao Chen
Zhenfang Chen
Kailun Yang
Kailai Li
Rainer Stiefelhagen
217
2
0
25 Mar 2025
Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces
Computer Vision and Pattern Recognition (CVPR), 2025
Chenyangguang Zhang
Alexandros Delitzas
Fangjinhua Wang
Ruida Zhang
Xiangyang Ji
Marc Pollefeys
Francis Engelmann
3DV, 3DPC
377
22
0
24 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS, VLM
660
20
0
23 Mar 2025
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
Foundations and Trends® in Signal Processing (FTSP), 2025
Jiaxin Huang
Runnan Chen
Ziwen Li
Zhengqing Gao
Xiao He
Yandong Guo
Mingming Gong
Tongliang Liu
LRM
373
8
0
23 Mar 2025
Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation
Ziming Wei
Bingqian Lin
Yunshuang Nie
Jiaqi Chen
Shikui Ma
Hang Xu
Xiaodan Liang
488
3
0
23 Mar 2025
Distilling Monocular Foundation Model for Fine-grained Depth Completion
Computer Vision and Pattern Recognition (CVPR), 2025
Yingping Liang
Yutao Hu
Wenqi Shao
Ying Fu
MDE
291
8
0
21 Mar 2025
OffsetOPT: Explicit Surface Reconstruction without Normals
Computer Vision and Pattern Recognition (CVPR), 2025
Huan Lei
3DPC
283
0
0
20 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Computer Vision and Pattern Recognition (CVPR), 2025
Jinlong Li
Cristiano Saltori
Fabio Poiesi
Andrii Zadaianchuk
1.1K
7
0
20 Mar 2025
IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes
IEEE International Conference on Robotics and Automation (ICRA), 2025
Haochen Zhang
Nader Zantout
Pujith Kachana
Ji Zhang
VGen
335
2
0
20 Mar 2025
Do Visual Imaginations Improve Vision-and-Language Navigation Agents?
Computer Vision and Pattern Recognition (CVPR), 2025
Akhil Perincherry
Jacob Krantz
Stefan Lee
LM&Ro
264
7
0
20 Mar 2025
UniK3D: Universal Camera Monocular 3D Estimation
Computer Vision and Pattern Recognition (CVPR), 2025
Luigi Piccinelli
Daniel Gehrig
Mattia Segu
Yifan Yang
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
269
17
0
20 Mar 2025
SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes
Computer Vision and Pattern Recognition (CVPR), 2025
Weixiao Gao
Liangliang Nan
H. Ledoux
3DV, 3DPC
310
3
0
19 Mar 2025