Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.14760
Cited By
Can 3D Vision-Language Models Truly Understand Natural Language?
21 March 2024
Weipeng Deng
Jihan Yang
Runyu Ding
Jiahui Liu
Yijiang Li
Xiaojuan Qi
Edith Ngai
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Can 3D Vision-Language Models Truly Understand Natural Language?"
11 / 11 papers shown
Title
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Y. Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
75
3
0
28 Mar 2025
Do large language vision models understand 3D shapes?
Sagi Eppel
3DV
81
1
0
14 Dec 2024
MC-LLaVA: Multi-Concept Personalized Vision-Language Model
Ruichuan An
Sihan Yang
Ming Lu
Kai Zeng
Yulin Luo
...
Hao Liang
Qi She
Shanghang Zhang
W. Zhang
Wentao Zhang
76
5
0
18 Nov 2024
Can OOD Object Detectors Learn from Foundation Models?
Jiahui Liu
Xin Wen
Shizhen Zhao
Y. Chen
Xiaojuan Qi
OODD
31
2
0
08 Sep 2024
V-IRL: Grounding Virtual Intelligence in Real Life
Jihan Yang
Runyu Ding
Ellis L Brown
Xiaojuan Qi
Saining Xie
LM&Ro
48
18
0
05 Feb 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas J. Guibas
3DH
3DPC
3DV
PINN
219
13,886
0
02 Dec 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,458
0
06 Jun 2016
1