arXiv:1709.06158

Matterport3D: Learning from RGB-D Data in Indoor Environments

18 September 2017

Thomas Funkhouser

Matthias Nießner

Shuran Song

Papers citing "Matterport3D: Learning from RGB-D Data in Indoor Environments"

50 / 1,327 papers shown

UNRealNet: Learning Uncertainty-Aware Navigation Features from High-Fidelity Scans of Real Environments

Sebastian Scherer

Ali-Akbar Agha-Mohammadi

200

7

0

11 Jul 2024

SRPose: Two-view Relative Pose Estimation with Sparse Keypoints

302

3

0

11 Jul 2024

Fusion of Short-term and Long-term Attention for Video Mirror Detection

Yukun Lai

158

1

0

10 Jul 2024

Controlling Space and Time with Diffusion Models

Andrea Tagliasacchi

458

54

0

10 Jul 2024

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Shitao Tang

Alan Yuille

221

5

0

09 Jul 2024

Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI

Xiaodan Liang

Liang Lin

LM&Ro SyDa AI4CE

619

185

0

09 Jul 2024

Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation

Kwan-Yee K. Wong

380

44

0

08 Jul 2024

Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis

Wengang Zhou

268

12

0

07 Jul 2024

Open Panoramic Segmentation

Ruiping Liu

Kailun Yang

Jiaming Zhang

Rainer Stiefelhagen

276

14

0

02 Jul 2024

Object Segmentation from Open-Vocabulary Manipulation Instructions Based on Optimal Transport Polygon Matching with Multimodal Foundation Models

Takayuki Nishimura

Motonari Kambara

Komei Sugiura

261

1

0

01 Jul 2024

Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding

Fangxing Chen

Xueping Liu

Yongjin Liu

250

9

0

28 Jun 2024

HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model

Vikram S. Voleti

250

6

0

28 Jun 2024

SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas

Ivaylo Boyadzhiev

Manjunath Narayana

Will Hutchcroft

James Hays

156

8

0

27 Jun 2024

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions

Zhi-Qi Cheng

Teruko Mitamura

Alexander G. Hauptmann

268

21

0

27 Jun 2024

360 in the Wild: Dataset for Depth Prediction and View Synthesis

François Rameau

Jaesik Park

225

1

0

27 Jun 2024

MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation

330

3

0

25 Jun 2024

Smart Feature is What You Need

298

0

0

22 Jun 2024

CityNav: A Large-Scale Dataset for Real-World Aerial Navigation

Taiki Miyanishi

325

23

0

20 Jun 2024

Estimating Map Completeness in Robot Exploration

Marco Maria Ferrara

Giacomo Boracchi

Francesco Amigoni

194

3

0

19 Jun 2024

Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation

Neural Information Processing Systems (NeurIPS), 2024

256

27

0

18 Jun 2024

Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation

Alexander Raistrick

...

Stamatis Alexandropoulos

385

79

0

17 Jun 2024

Solving Vision Tasks with Simple Photoreceptors Instead of Cameras

Andrew Spielberg

167

1

0

17 Jun 2024

Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation

Conference on Robot Learning (CoRL), 2024

Xiangyang Li

Yeqi Liu

318

28

0

14 Jun 2024

A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion

3DV ViT 3DPC MDE

140

2

0

14 Jun 2024

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations

...

350

34

0

13 Jun 2024

Pandora: Towards General World Model with Natural Language Actions and Video States

Guangyi Liu

...

Zhengzhong Liu

Eric P. Xing

302

67

0

12 Jun 2024

Can Large Language Models Understand Spatial Audio?

Wenyi Yu

Guangzhi Sun

Xianzhao Chen

Tian Tan

...

Jun Zhang

Yuxuan Wang

Chao Zhang

350

18

0

12 Jun 2024

Hearing Anything Anywhere

248

13

0

11 Jun 2024

Demonstrating HumanTHOR: A Simulation Platform and Benchmark for Human-Robot Collaboration in a Shared Workspace

268

5

0

10 Jun 2024

Multimodal Contextualized Semantic Parsing from Speech

David Harwath

182

1

0

10 Jun 2024

EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Xuanjing Huang

276

46

0

09 Jun 2024

Diverse 3D Human Pose Generation in Scenes based on Decoupled Structure

211

1

0

09 Jun 2024

I2EDL: Interactive Instruction Error Detection and Localization

Francesco Taioli

Alessio Del Bue

Alessandro Farinelli

282

3

0

07 Jun 2024

Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking

European Conference on Computer Vision (ECCV), 2024

Hao Dong

300

35

0

06 Jun 2024

SelfReDepth: Self-Supervised Real-Time Depth Restoration for Consumer-Grade Sensors

Alexandre Duarte

Francisco Fernandes

João M. Pereira

Catarina Moreira

Jacinto C. Nascimento

Joaquim A. Jorge

235

3

0

05 Jun 2024

Balancing Performance and Efficiency in Zero-shot Robotic Navigation

Dmytro Kuzmenko

251

0

0

05 Jun 2024

TopViewRS: Vision-Language Models as Top-View Spatial Reasoners

303

46

0

04 Jun 2024

CoNav: A Benchmark for Human-Centered Collaborative Navigation

294

2

0

04 Jun 2024

Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts

Sen Wang

Jiajun Liu

247

7

0

04 Jun 2024

Teledrive: An Embodied AI based Telepresence System

Snehasis Banerjee

R. Roychoudhury

Abhijan Bhattacharya

Pradip Pramanick

Brojeshwar Bhowmick

332

3

0

01 Jun 2024

PanoNormal: Monocular Indoor 360° Surface Normal Estimation

228

1

0

29 May 2024

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping

Takashi Shibuya

Takuya Narihira

283

42

0

27 May 2024

Benchmarking General-Purpose In-Context Learning

526

5

0

27 May 2024

Vision-and-Language Navigation Generative Pretrained Transformer

258

0

0

27 May 2024

Estimating Depth of Monocular Panoramic Image with Teacher-Student Model Fusing Equirectangular and Spherical Representations

234

6

0

27 May 2024

Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model

Kuan-Chih Huang

Xiangtai Li

Ming-Hsuan Yang

376

22

0

27 May 2024

Map-based Modular Approach for Zero-shot Embodied Question Answering

Taiki Miyanishi

300

6

0

26 May 2024

MAGIC: Map-Guided Few-Shot Audio-Visual Acoustics Modeling

Kun-Li Channing Lin

152

0

0

22 May 2024

CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs

Lin Wang

219

0

0

19 May 2024

Grounded 3D-LLM with Referent Tokens

Dahua Lin

336

77

0

16 May 2024
