InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring

IEEE International Conference on Computer Vision (ICCV), 2021

1 March 2021

Papers citing "InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring"

50 / 91 papers shown

Unified Representation Space for 3D Visual Grounding

259

17 Jun 2025

FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding

316

16 Jun 2025

Grounding Beyond Detection: Enhancing Contextual Understanding in Embodied 3D Grounding

548

05 Jun 2025

Zero-Shot 3D Visual Grounding from Vision-Language Models

342

28 May 2025

LSVG: Language-Guided Scene Graphs with 2D-Assisted Multi-Modal Encoding for 3D Visual Grounding

646

07 May 2025

3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment

IEEE International Conference on Robotics and Automation (ICRA), 2025

280

03 May 2025

Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D

Krishna Murthy Jatavallabhula

...

313

19 Apr 2025

ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning

Computer Vision and Pattern Recognition (CVPR), 2025

222

30 Mar 2025

Empowering Large Language Models with 3D Situation Awareness

Computer Vision and Pattern Recognition (CVPR), 2025

441

29 Mar 2025

Vehicle-Scene Interaction: A Text-Driven 3D Lidar Place Recognition Method for Autonomous Driving

387

23 Mar 2025

AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring

AAAI Conference on Artificial Intelligence (AAAI), 2025

309

17 Jan 2025

GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models

863

02 Jan 2025

LidaRefer: Context-aware Outdoor 3D Visual Grounding for Autonomous Driving

Yeong-Seung Baek

Heung-Seon Oh

377

07 Nov 2024

Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention

Neural Information Processing Systems (NeurIPS), 2024

Haomeng Zhang

Chiao-An Yang

Raymond A. Yeh

364

29 Oct 2024

Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding

International Conference on Pattern Recognition (ICPR), 2024

Yang Liu

Daizong Liu

Wei Hu

3DPC

440

21 Oct 2024

VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding

Conference on Robot Learning (CoRL), 2024

301

17 Oct 2024

LESS: Label-Efficient and Single-Stage Referring 3D Segmentation

Neural Information Processing Systems (NeurIPS), 2024

479

17 Oct 2024

Grounding 3D Scene Affordance From Egocentric Interactions

Cuiyu Liu

Wei Zhai

Yuhang Yang

Hongchen Luo

Sen Liang

Yang Cao

Zheng-Jun Zha

427

29 Sep 2024

Bayesian Self-Training for Semi-Supervised 3D Segmentation

European Conference on Computer Vision (ECCV), 2024

260

12 Sep 2024

R2G: Reasoning to Ground in 3D Scenes

Pattern Recognition (Pattern Recogn.), 2024

Yixuan Li

Zan Wang

Wei Liang

418

24 Aug 2024

3D-GRES: Generalized 3D Referring Expression Segmentation

Jiayi Ji

336

30 Jul 2024

RefMask3D: Language-Guided Transformer for 3D Referring Segmentation

Shuting He

Henghui Ding

316

25 Jul 2024

Multi-branch Collaborative Learning Network for 3D Visual Grounding

Zhekai Lin

370

07 Jul 2024

Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding

359

13 Jun 2024

A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024

Wei Hu

434

09 Jun 2024

Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model

Xiangtai Li

Ming-Hsuan Yang

476

27 May 2024

When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models

...

462

16 May 2024

Generating Human Motion in 3D Scenes from Text Descriptions

Computer Vision and Pattern Recognition (CVPR), 2024

Zehong Shen

302

13 May 2024

Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners

Jiajun Wu

335

30 Apr 2024

Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization

289

17 Apr 2024

PointCloud-Text Matching: Benchmark Datasets and a Baseline

390

28 Mar 2024

Data-Efficient 3D Visual Grounding via Order-Aware Referring

Tung-Yu Wu

Sheng-Yu Huang

Yu-Chiang Frank Wang

794

25 Mar 2024

SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention

Feng Xiao

Hongbin Xu

Qiuxia Wu

Wenxiong Kang

311

13 Mar 2024

A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes

254

12 Mar 2024

MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding

423

05 Mar 2024

Adversarial Testing for Visual Grounding via Image-Aware Property Reduction

Cheng Li

286

02 Mar 2024

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts

384

17 Dec 2023

Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment

IEEE Transactions on Multimedia (IEEE TMM), 2023

634

15 Dec 2023

Mono3DVG: 3D Visual Grounding in Monocular Images

AAAI Conference on Artificial Intelligence (AAAI), 2023

Yangfan Zhan

Yuan Yuan

Zhitong Xiong

MDE

298

13 Dec 2023

Uni3DL: Unified Model for 3D and Language Understanding

393

05 Dec 2023

Text2Loc: 3D Point Cloud Localization from Natural Language

Computer Vision and Pattern Recognition (CVPR), 2023

434

27 Nov 2023

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Computer Vision and Pattern Recognition (CVPR), 2023

399

26 Nov 2023

CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data

Neural Information Processing Systems (NeurIPS), 2023

292

28 Oct 2023

Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

Peng Wang

326

24 Oct 2023

CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding

International Conference on Learning Representations (ICLR), 2023

460

10 Oct 2023

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

IEEE International Conference on Robotics and Automation (ICRA), 2023

405

162

21 Sep 2023

Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection

286

18 Sep 2023

Multi3DRefer: Grounding Text Description to Multiple 3D Objects

IEEE International Conference on Computer Vision (ICCV), 2023

Yiming Zhang

ZeMing Gong

Angel X. Chang

559

155

11 Sep 2023

Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

European Conference on Computer Vision (ECCV), 2023

Ozan Unal

Daniel Gehrig

Suman Saha

Luc Van Gool

364

08 Sep 2023

Dense Object Grounding in 3D Scenes

ACM Multimedia (ACM MM), 2023

Wencan Huang

Daizong Liu

Wei Hu

287

05 Sep 2023