Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2309.12311
Cited By
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
21 September 2023
Jianing Yang
Xuweiyi Chen
Shengyi Qian
Nikhil Madaan
Madhavan Iyengar
David Fouhey
Joyce Chai
LM&Ro
LLMAG
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent"
50 / 78 papers shown
Title
3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment
X. Li
J. H. Liu
Nuowei Han
Liang Heng
Y. Guo
Hao Dong
Yang Liu
46
0
0
03 May 2025
SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
Nader Zantout
Haochen Zhang
Pujith Kachana
J. Qiu
Ji Zhang
Wenshan Wang
LM&Ro
LRM
47
0
0
25 Apr 2025
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
Sergio Arnaud
Paul Mcvay
Ada Martin
Arjun Majumdar
Krishna Murthy Jatavallabhula
...
Nicolas Ballas
Mido Assran
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
3DPC
36
0
0
19 Apr 2025
Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions
He Zhu
Quyu Kong
Kechun Xu
Xunlong Xia
Bing Deng
Jieping Ye
R. Xiong
Y. Wang
25
0
0
07 Apr 2025
ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning
Zhenyang Liu
Yikai Wang
Sixiao Zheng
Tongying Pan
Longfei Liang
Yanwei Fu
Xiangyang Xue
LRM
49
0
0
30 Mar 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Y. Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
75
3
0
28 Mar 2025
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
Jiaxin Huang
Runnan Chen
Ziwen Li
Zhengqing Gao
Xiao He
Yandong Guo
M. Gong
Tongliang Liu
LRM
44
0
0
23 Mar 2025
RAIDER: Tool-Equipped Large Language Model Agent for Robotic Action Issue Detection, Explanation and Recovery
Silvia Izquierdo-Badiola
Carlos Rizzo
Guillem Alenyà
LLMAG
LM&Ro
69
0
0
22 Mar 2025
Exploring 3D Activity Reasoning and Planning: From Implicit Human Intentions to Route-Aware Planning
Xueying Jiang
Wenhao Li
Xiaoqin Zhang
Ling Shao
Shijian Lu
LRM
37
0
0
17 Mar 2025
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Hanxun Yu
Wentong Li
Song Wang
J. Chen
Jianke Zhu
3DV
LRM
71
3
0
01 Mar 2025
FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks
Tanawan Premsri
Parisa Kordjamshidi
40
1
0
25 Feb 2025
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection
Boyu Mi
Hanqing Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
67
0
0
21 Feb 2025
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Mohit Bansal
Parisa Kordjamshidi
LRM
51
17
0
31 Dec 2024
Multi-Agent Planning Using Visual Language Models
Michele Brienza
F. Argenziano
Vincenzo Suriani
D. Bloisi
Daniele Nardi
LM&Ro
LLMAG
59
4
0
31 Dec 2024
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning
Yuncong Yang
Han Yang
Jiachen Zhou
Peihao Chen
Hongxin Zhang
Yilun Du
Chuang Gan
61
0
0
23 Nov 2024
VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation
Bangguo Yu
Yuzhen Liu
Lei Han
H. Kasaei
Tingguang Li
M. Cao
LM&Ro
64
2
0
18 Nov 2024
The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare
Souren Pashangpour
Goldie Nejat
LM&MA
42
7
0
05 Nov 2024
Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
Haomeng Zhang
Chiao-An Yang
Raymond A. Yeh
26
1
0
29 Oct 2024
SceneGenAgent: Precise Industrial Scene Generation with Coding Agent
Xiao Xia
Dan Zhang
Zibo Liao
Zhenyu Hou
Tianrui Sun
Jing Li
Ling Fu
Yuxiao Dong
LM&Ro
3DV
LLMAG
AI4CE
28
0
0
29 Oct 2024
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
Runsen Xu
Zhiwei Huang
Tai Wang
Y. Chen
Jiangmiao Pang
Dahua Lin
VGen
26
0
0
17 Oct 2024
From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts
Zhuohao Jerry Zhang
E. Schoop
Jeffrey Nichols
Anuj Mahajan
Amanda Swearngin
LLMAG
18
1
0
11 Oct 2024
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Venkata Naren Devarakonda
Raktim Gautam Goswami
Ali Umut Kaypak
Naman Patel
Rooholla Khorrambakht
P. Krishnamurthy
Farshad Khorrami
LM&Ro
27
3
0
08 Oct 2024
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models
Yue Zhang
Zhiyang Xu
Ying Shen
Parisa Kordjamshidi
Lifu Huang
24
6
0
04 Oct 2024
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
Kaizhi Zheng
Xiaotong Chen
Xuehai He
Jing Gu
Linjie Li
Zhengyuan Yang
Kevin Qinghong Lin
Jianfeng Wang
Lijuan Wang
Xin Eric Wang
KELM
DiffM
33
0
0
03 Oct 2024
Grounding 3D Scene Affordance From Egocentric Interactions
Cuiyu Liu
Wei Zhai
Yuhang Yang
Hongchen Luo
Sen Liang
Yang Cao
Zheng-Jun Zha
18
1
0
29 Sep 2024
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness
Chenming Zhu
Tai Wang
Wenwei Zhang
Jiangmiao Pang
Xihui Liu
84
29
0
26 Sep 2024
ChatCam: Empowering Camera Control through Conversational AI
Xinhang Liu
Yu-Wing Tai
Chi-Keung Tang
VGen
18
2
0
25 Sep 2024
Semantics-Controlled Gaussian Splatting for Outdoor Scene Reconstruction and Rendering in Virtual Reality
Hannah Schieber
Jacob Young
Tobias Langlotz
Stefanie Zollmann
Daniel Roth
3DGS
18
0
0
24 Sep 2024
Multi-modal Situated Reasoning in 3D Scenes
Xiongkun Linghu
Jiangyong Huang
Xuesong Niu
Xiaojian Ma
Baoxiong Jia
Siyuan Huang
18
11
0
04 Sep 2024
EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution
F. Argenziano
Michele Brienza
Vincenzo Suriani
Daniele Nardi
D. Bloisi
LM&Ro
33
0
0
30 Aug 2024
DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors
Zizheng Yan
Jiapeng Zhou
Fanpeng Meng
Yushuang Wu
Lingteng Qiu
Zisheng Ye
Shuguang Cui
Guanying Chen
Xiaoguang Han
DiffM
26
4
0
23 Jul 2024
OpenSU3D: Open World 3D Scene Understanding using Foundation Models
Rafay Mohiuddin
Sai Manoj Prakhya
Fiona Collins
Ziyuan Liu
André Borrmann
26
2
0
19 Jul 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Y. Liu
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro
SyDa
AI4CE
48
27
0
09 Jul 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu
Tai Wang
Wenwei Zhang
Kai Chen
Xihui Liu
ReLM
LRM
45
16
0
01 Jul 2024
LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models
Alison Bartsch
A. Farimani
70
6
0
12 Jun 2024
Situational Awareness Matters in 3D Vision Language Reasoning
Yunze Man
Liang-Yan Gui
Yu-Xiong Wang
30
12
0
11 Jun 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
23
9
0
09 Jun 2024
VP-LLM: Text-Driven 3D Volume Completion with Large Language Models through Patchification
Jianmeng Liu
Yichen Liu
Yuyao Zhang
Zeyuan Meng
Yu-Wing Tai
Chi-Keung Tang
32
0
0
08 Jun 2024
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang
Xuweiyi Chen
Nikhil Madaan
Madhavan Iyengar
Shengyi Qian
David Fouhey
Joyce Chai
3DV
61
11
0
07 Jun 2024
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences
Yidong Huang
Jacob Sansom
Ziqiao Ma
Felix Gervits
Joyce Chai
33
17
0
05 Jun 2024
Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models
Tianrun Chen
Chunan Yu
Jing Li
Jianqi Zhang
Lanyun Zhu
Deyi Ji
Yong Zhang
Ying-Dong Zang
Zejian Li
Lingyun Sun
LRM
36
9
0
29 May 2024
Safety Control of Service Robots with LLMs and Embodied Knowledge Graphs
Yong Qi
Gabriel Kyebambo
Siyuan Xie
Wei Shen
Shenghui Wang
Bitao Xie
Bin He
Zhipeng Wang
Shuo Jiang
21
2
0
28 May 2024
LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding
Haoyu Zhao
Wenhang Ge
Ying-cong Chen
ObjD
MLLM
VLM
21
1
0
27 May 2024
Grounded 3D-LLM with Referent Tokens
Yilun Chen
Shuai Yang
Haifeng Huang
Tai Wang
Ruiyuan Lyu
Runsen Xu
Dahua Lin
Jiangmiao Pang
45
22
0
16 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
29
11
0
16 May 2024
Transcrib3D: 3D Referring Expression Resolution through Large Language Models
Jiading Fang
Xiangshan Tan
Shengjie Lin
Igor Vasiljevic
Vitor Campagnolo Guizilini
Hongyuan Mei
Rares Ambrus
Gregory Shakhnarovich
Matthew R. Walter
LM&Ro
25
4
0
30 Apr 2024
Think-Program-reCtify: 3D Situated Reasoning with Large Language Models
Qingrong He
Kejun Lin
Shizhe Chen
Anwen Hu
Qin Jin
LRM
26
1
0
23 Apr 2024
Idea-2-3D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs
Junhao Chen
Xiang Li
Xiaojun Ye
Chao Li
Zhaoxin Fan
Hao Zhao
VGen
3DV
184
4
0
05 Apr 2024
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem
Jean Lahoud
Hisham Cholakkal
LRM
39
3
0
04 Apr 2024
Language Models are Spacecraft Operators
Victor Rodríguez-Fernández
Alejandro Carrasco
Jason Cheng
Eli Scharf
P. M. Siew
Richard Linares
LM&Ro
LLMAG
33
2
0
30 Mar 2024
1
2
Next