3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Computer Vision and Pattern Recognition (CVPR), 2024

7 June 2024

Joyce Chai

Papers citing "3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination"

31 / 81 papers shown

3D Concept Learning and Reasoning from Multi-View Images

Computer Vision and Pattern Recognition (CVPR), 2023

Chuang Gan

274

20 Mar 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

International Conference on Machine Learning (ICML), 2023

Silvio Savarese

1.3K

6,661

30 Jan 2023

Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments

IEEE Robotics and Automation Letters (RA-L), 2023

Mayank Mittal

...

Marco Hutter

311

409

10 Jan 2023

Is GPT-3 a Good Data Annotator?

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

325

307

20 Dec 2022

ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

Hsin-Ying Lee

267

12 Dec 2022

Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

302

14 Oct 2022

SQA3D: Situated Question Answering in 3D Scenes

International Conference on Learning Representations (ICLR), 2022

Xiaojian Ma

505

245

14 Oct 2022

Mask3D: Mask Transformer for 3D Semantic Instance Segmentation

IEEE International Conference on Robotics and Automation (ICRA), 2022

Siyu Tang

321

301

06 Oct 2022

Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases

Zhuo Li

188

05 Jul 2022

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

...

318

367

14 Jun 2022

Flamingo: a Visual Language Model for Few-Shot Learning

Neural Information Processing Systems (NeurIPS), 2022

Jean-Baptiste Alayrac

...

695

4,826

29 Apr 2022

Language-Grounded Indoor 3D Semantic Segmentation in the Wild

European Conference on Computer Vision (ECCV), 2022

385

257

16 Apr 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.3K

14,449

28 Jan 2022

ScanQA: 3D Question Answering for Spatial Scene Understanding

Computer Vision and Pattern Recognition (CVPR), 2021

436

325

20 Dec 2021

Habitat 2.0: Training Home Assistants to Rearrange their Habitat

Neural Information Processing Systems (NeurIPS), 2021

...

393

638

28 Jun 2021

LoRA: Low-Rank Adaptation of Large Language Models

International Conference on Learning Representations (ICLR), 2021

1.6K

15,273

17 Jun 2021

Grounding 'Grounding' in NLP

Findings (Findings), 2021

Khyathi Chandu

Yonatan Bisk

A. Black

167

04 Jun 2021

ManipulaTHOR: A Framework for Visual Object Manipulation

Computer Vision and Pattern Recognition (CVPR), 2021

845

151

22 Apr 2021

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Computer Vision and Pattern Recognition (CVPR), 2020

Matthias Nießner

305

230

03 Dec 2020

3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics

IEEE International Conference on Computer Vision (ICCV), 2020

289

354

18 Nov 2020

Experience Grounds Language

...

499

399

21 Apr 2020

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Computer Vision and Pattern Recognition (CVPR), 2020

...

293

280

14 Apr 2020

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language

European Conference on Computer Vision (ECCV), 2019

Dave Zhenyu Chen

Angel X. Chang

Matthias Nießner

421

507

18 Dec 2019

Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling

European Conference on Computer Vision (ECCV), 2019

354

346

01 Aug 2019

Habitat: A Platform for Embodied AI Research

...

Devi Parikh

570

1,676

02 Apr 2019

Unity: A General Platform for Intelligent Agents

Arthur Juliani

Vincent-Pierre Berges

...

Yuan Gao

402

907

07 Sep 2018

Object Hallucination in Image Captioning

425

593

06 Sep 2018

AI2-THOR: An Interactive 3D Environment for Visual AI

...

679

1,295

14 Dec 2017

ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes

Computer Vision and Pattern Recognition (CVPR), 2017

Matthias Nießner

1.3K

4,902

14 Feb 2017

Modeling Context in Referring Expressions

570

1,522

31 Jul 2016

Microsoft COCO: Common Objects in Context

European Conference on Computer Vision (ECCV), 2014

Piotr Dollár

17.5K

49,453

01 May 2014