2403.16539
Data-Efficient 3D Visual Grounding via Order-Aware Referring
25 March 2024
Tung-Yu Wu
Sheng-Yu Huang
Yu-Chiang Frank Wang
Papers citing "Data-Efficient 3D Visual Grounding via Order-Aware Referring"
48 / 48 papers shown
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks
Xu Zheng
Zihao Dongfang
Lutao Jiang
Boyuan Zheng
Yulong Guo
...
L. Zhang
Danda Pani Paudel
Nicu Sebe
Luc Van Gool
Xuming Hu
LRM
VLM
848
12
0
29 Oct 2025
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
Xiaoyu Zhu
Hao Zhou
Pengfei Xing
Long Zhao
Hao Xu
Junwei Liang
Alex Hauptmann
Ting Liu
Andrew C. Gallagher
DiffM
414
13
0
18 Jul 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
425
36
0
09 Jun 2024
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
International Conference on Learning Representations (ICLR), 2023
Eslam Mohamed Bakr
Mohamed Ayman
Mahmoud Ahmed
Habib Slim
Mohamed Elhoseiny
LRM
452
16
0
10 Oct 2023
Multi3DRefer: Grounding Text Description to Multiple 3D Objects
IEEE International Conference on Computer Vision (ICCV), 2023
Yiming Zhang
ZeMing Gong
Angel X. Chang
549
153
0
11 Sep 2023
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zehan Wang
Haifeng Huang
Yang Zhao
Lin Li
Xize Cheng
Yichen Zhu
Aoxiong Yin
Zhou Zhao
3DPC
228
35
0
25 Jul 2023
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
Yang Zhao
Zhijie Lin
Daquan Zhou
Zilong Huang
Jiashi Feng
Bingyi Kang
MLLM
284
129
0
17 Jul 2023
OpenMask3D: Open-Vocabulary 3D Instance Segmentation
Neural Information Processing Systems (NeurIPS), 2023
Ayca Takmaz
Elisabetta Fedele
R. Sumner
Marc Pollefeys
F. Tombari
Francis Engelmann
ISeg
VLM
309
292
0
23 Jun 2023
Fine-Grained Visual Prompting
Neural Information Processing Systems (NeurIPS), 2023
Lingfeng Yang
Yueze Wang
Xiang Li
Xinlong Wang
Jian Yang
ObjD
VLM
291
116
0
07 Jun 2023
What does CLIP know about a red circle? Visual prompt engineering for VLMs
IEEE International Conference on Computer Vision (ICCV), 2023
Aleksandar Shtedritski
Christian Rupprecht
Andrea Vedaldi
VLM
MLLM
502
255
0
13 Apr 2023
Segment Anything
IEEE International Conference on Computer Vision (ICCV), 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
1.1K
12,789
0
05 Apr 2023
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
Computer Vision and Pattern Recognition (CVPR), 2023
Joy Hsu
Jiayuan Mao
Jiajun Wu
PINN
331
78
0
23 Mar 2023
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models
Junjie Ye
Xuanting Chen
Nuo Xu
Can Zu
Zekai Shao
...
Jie Zhou
Siming Chen
Tao Gui
Xuanjing Huang
ELM
349
469
0
18 Mar 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
European Conference on Computer Vision (ECCV), 2023
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
...
Chunyuan Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
ObjD
913
3,820
0
09 Mar 2023
Directed Diffusion: Direct Control of Object Placement through Attention Guidance
AAAI Conference on Artificial Intelligence (AAAI), 2023
W. Ma
J. P. Lewis
Avisek Lahiri
Thomas Leung
W. Kleijn
DiffM
513
86
0
25 Feb 2023
ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Ahmed Abdelreheem
Kyle Olszewski
Hsin-Ying Lee
Peter Wonka
Panos Achlioptas
3DPC
317
35
0
12 Dec 2022
Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding
Neural Information Processing Systems (NeurIPS), 2022
Eslam Mohamed Bakr
Yasmeen Alsaedy
Mohamed Elhoseiny
3DPC
246
62
0
25 Nov 2022
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Neural Information Processing Systems (NeurIPS), 2022
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
339
145
0
17 Nov 2022
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Computer Vision and Pattern Recognition (CVPR), 2022
Yanmin Wu
Xinhua Cheng
Renrui Zhang
Zesen Cheng
Jian Zhang
414
119
0
29 Sep 2022
3D Instances as 1D Kernels
European Conference on Computer Vision (ECCV), 2022
Yizhe Wu
Min Shi
Shuaiyuan Du
Hao Lu
Zhiguo Cao
Weicai Zhong
ISeg
3DPC
255
49
0
15 Jul 2022
Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases
Zhihao Yuan
Xu Yan
Zhuo Li
Xuhao Li
Yao Guo
Shuguang Cui
Zhen Li
287
19
0
05 Jul 2022
GLIPv2: Unifying Localization and Vision-Language Understanding
Haotian Zhang
Pengchuan Zhang
Xiaowei Hu
Yen-Chun Chen
Liunian Harold Li
Xiyang Dai
Lijuan Wang
Lu Yuan
Lei Li
Jianfeng Gao
ObjD
VLM
368
371
0
12 Jun 2022
Large Language Models are Zero-Shot Reasoners
Neural Information Processing Systems (NeurIPS), 2022
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
1.6K
6,749
0
24 May 2022
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
Computer Vision and Pattern Recognition (CVPR), 2022
Junyu Luo
Jiahui Fu
Xianghao Kong
Chen Gao
Haibing Ren
Hao Shen
Huaxia Xia
Si Liu
366
135
0
13 Apr 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
335
169
0
12 Apr 2022
Multi-View Transformer for 3D Visual Grounding
Computer Vision and Pattern Recognition (CVPR), 2022
Shijia Huang
Yilun Chen
Jiaya Jia
Liwei Wang
461
191
0
05 Apr 2022
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
International Conference on Learning Representations (ICLR), 2022
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
1.0K
2,529
0
07 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
2.7K
17,183
0
28 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
4.7K
23,580
0
20 Dec 2021
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain
N. Gkanatsios
Ishita Mediratta
Katerina Fragkiadaki
ObjD
602
159
0
16 Dec 2021
Grounded Language-Image Pre-training
Liunian Harold Li
Pengchuan Zhang
Haotian Zhang
Jianwei Yang
Chunyuan Li
...
Lu Yuan
Lei Zhang
Lei Li
Kai-Wei Chang
Jianfeng Gao
ObjD
VLM
582
1,561
0
07 Dec 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
707
251
0
24 Sep 2021
TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding
ACM Multimedia (ACM MM), 2021
Dailan He
Yusheng Zhao
Junyu Luo
Tianrui Hui
Shaofei Huang
Aixi Zhang
Si Liu
ViT
416
125
0
05 Aug 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
2.7K
8,889
0
07 Jul 2021
SAT: 2D Semantics Assisted Training for 3D Visual Grounding
IEEE International Conference on Computer Vision (ICCV), 2021
Zhengyuan Yang
Songyang Zhang
Liwei Wang
Jiebo Luo
3DPC
478
165
0
24 May 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
IEEE International Conference on Computer Vision (ICCV), 2021
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
687
1,101
0
26 Apr 2021
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring
IEEE International Conference on Computer Vision (ICCV), 2021
Zhihao Yuan
Xu Yan
Yinghong Liao
Ruimao Zhang
Sheng Wang
Zhen Li
Shuguang Cui
372
185
0
01 Mar 2021
End-to-End Object Detection with Transformers
European Conference on Computer Vision (ECCV), 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
3.1K
17,593
0
26 May 2020
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2020
Li Jiang
Hengshuang Zhao
Shaoshuai Shi
Shu Liu
Chi-Wing Fu
Jiaya Jia
3DPC
412
555
0
03 Apr 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
European Conference on Computer Vision (ECCV), 2019
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
542
549
0
18 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Neural Information Processing Systems (NeurIPS), 2019
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
1.1K
50,986
0
03 Dec 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
6.0K
29,143
0
26 Jul 2019
Deep Hough Voting for 3D Object Detection in Point Clouds
C. Qi
Or Litany
Kaiming He
Leonidas Guibas
3DPC
732
1,460
0
21 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
3.1K
112,756
0
11 Oct 2018
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
1.1K
1,683
0
20 Nov 2017
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Neural Information Processing Systems (NeurIPS), 2017
C. Qi
L. Yi
Hao Su
Leonidas Guibas
3DPC
3DV
868
13,667
0
07 Jun 2017
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Computer Vision and Pattern Recognition (CVPR), 2017
Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
3DPC
3DV
1.5K
5,260
0
14 Feb 2017
Adam: A Method for Stochastic Optimization
International Conference on Learning Representations (ICLR), 2014
Diederik P. Kingma
Jimmy Ba
ODL
5.0K
164,701
0
22 Dec 2014