Matterport3D: Learning from RGB-D Data in Indoor Environments

18 September 2017

Matthias Nießner

Shuran Song

Papers citing "Matterport3D: Learning from RGB-D Data in Indoor Environments"

50 / 1,327 papers shown

Conditional Panoramic Image Generation via Masked Autoregressive Modeling

322

22 May 2025

Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses

250

19 May 2025

SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence

397

19 May 2025

BadNAVer: Exploring Jailbreak Attacks On Vision-and-Language Navigation

678

18 May 2025

Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?

276

17 May 2025

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation

396

16 May 2025

Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities

327

14 May 2025

NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance

389

13 May 2025

VISTA: Generative Visual Imagination for Vision-and-Language Navigation

575

09 May 2025

DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding

International Conference on Learning Representations (ICLR), 2025

311

08 May 2025

CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

584

08 May 2025

LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs

Kornélia Sára Szatmáry

Fei Wang

361

06 May 2025

Estimating Commonsense Scene Composition on Belief Scene Graphs

IEEE International Conference on Robotics and Automation (ICRA), 2025

Mario A. V. Saucedo

Vignesh Kottayam Viswanathan

Christoforos Kanellakis

G. Nikolakopoulos

3DV

314

05 May 2025

PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications

Trisanth Srinivasan

Santosh Patapati

379

03 May 2025

A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI

394

01 May 2025

A Survey of Interactive Generative Video

432

30 Apr 2025

CasaGPT: Cuboid Arrangement and Scene Assembly for Interior Design

Computer Vision and Pattern Recognition (CVPR), 2025

275

28 Apr 2025

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

...

433

26 Apr 2025

SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models

796

25 Apr 2025

Dynamic Camera Poses and Where to Find Them

Computer Vision and Pattern Recognition (CVPR), 2025

411

24 Apr 2025

Multimodal Perception for Goal-oriented Navigation: A Survey

I-Tak Ieong

Hao Tang

LM&Ro LRM

322

22 Apr 2025

ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-centric Semantic Fusion

IEEE Robotics and Automation Letters (IEEE RA-L), 2025

596

20 Apr 2025

Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding

Friedrich Fraundorfer

3DPC 3DV

1.2K

18 Apr 2025

RefComp: A Reference-guided Unified Framework for Unpaired Point Cloud Completion

IEEE Transactions on Multimedia (TMM), 2025

259

18 Apr 2025

Digital Twin Generation from Visual Data: A Survey

465

17 Apr 2025

Real-World Depth Recovery via Structure Uncertainty Modeling and Inaccurate GT Depth Fitting

Delong Suzhang

Meng Yang

219

16 Apr 2025

CL-CoTNav: Closed-Loop Hierarchical Chain-of-Thought for Zero-Shot Object-Goal Navigation with Vision-Language Models

380

11 Apr 2025

Endowing Embodied Agents with Spatial Reasoning Capabilities for Vision-and-Language Navigation

Luo Ling

Bai Qianqian

LM&Ro

268

09 Apr 2025

SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding

Computer Vision and Pattern Recognition (CVPR), 2025

Mingfei Chen

I. D. Gebru

Ishwarya Ananthabhotla

278

08 Apr 2025

LPA3D: 3D Room-Level Scene Generation from In-the-Wild Images

240

03 Apr 2025

Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision

Information Fusion (Inf. Fusion), 2025

...

445

03 Apr 2025

WorldScore: A Unified Evaluation Benchmark for World Generation

397

01 Apr 2025

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation

530

31 Mar 2025

Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model

376

30 Mar 2025

Empowering Large Language Models with 3D Situation Awareness

Computer Vision and Pattern Recognition (CVPR), 2025

327

29 Mar 2025

From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D

...

613

29 Mar 2025

Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments

Yifan Xu

V. Kamat

Carol Menassa

294

29 Mar 2025

uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

237

27 Mar 2025

Scene-agnostic Pose Regression for Visual Localization

Computer Vision and Pattern Recognition (CVPR), 2025

217

25 Mar 2025

Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces

Computer Vision and Pattern Recognition (CVPR), 2025

377

24 Mar 2025

SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining

...

660

23 Mar 2025

MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation

Foundations and Trends® in Signal Processing (FTSP), 2025

373

23 Mar 2025

Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation

488

23 Mar 2025

Distilling Monocular Foundation Model for Fine-grained Depth Completion

Computer Vision and Pattern Recognition (CVPR), 2025

291

21 Mar 2025

OffsetOPT: Explicit Surface Reconstruction without Normals

Computer Vision and Pattern Recognition (CVPR), 2025

Huan Lei

3DPC

283

20 Mar 2025

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Computer Vision and Pattern Recognition (CVPR), 2025

1.1K

20 Mar 2025

IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes

IEEE International Conference on Robotics and Automation (ICRA), 2025

335

20 Mar 2025

Do Visual Imaginations Improve Vision-and-Language Navigation Agents?

Computer Vision and Pattern Recognition (CVPR), 2025

264

20 Mar 2025

UniK3D: Universal Camera Monocular 3D Estimation

Computer Vision and Pattern Recognition (CVPR), 2025

269

20 Mar 2025

SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes

Computer Vision and Pattern Recognition (CVPR), 2025

310

19 Mar 2025