2403.13438
Cited By
v1
v2
v3
v4 (latest)
SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors
Neural Information Processing Systems (NeurIPS), 2024
18 March 2024
Chenyang Ma
Kai Lu
Ta-Ying Cheng
Niki Trigoni
Andrew Markham
LRM
Papers citing
"SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors"
24 / 24 papers shown
SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
Siyi Chen
Mikaela Angelina Uy
Chan Hee Song
Faisal Ladhak
Adithyavairavan Murali
Qing Qu
Stan Birchfield
Valts Blukis
Jonathan Tremblay
OffRL
LRM
151
0
0
03 Dec 2025
Geometrically-Constrained Agent for Spatial Reasoning
Zeren Chen
Xiaoya Lu
Zhijie Zheng
Pengrui Li
Lehan He
Yijin Zhou
Jing Shao
Bohan Zhuang
Lu Sheng
LRM
103
0
0
27 Nov 2025
Scenes as Tokens: Multi-Scale Normal Distributions Transform Tokenizer for General 3D Vision-Language Understanding
Yutao Tang
Cheng Zhao
Gaurav Mittal
Rohith Kukkala
Rama Chellappa
Cheng-Fang Peng
Mei Chen
VLM
145
0
0
26 Nov 2025
G²VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
Wenbo Hu
Jingli Lin
Yilin Long
Yunlong Ran
Lihan Jiang
Y. Wang
Chenming Zhu
Runsen Xu
Tai Wang
Jiangmiao Pang
VLM
288
0
0
26 Nov 2025
SpatialGeo: Boosting Spatial Reasoning in Multimodal LLMs via Geometry-Semantics Fusion
Jiajie Guo
Qingpeng Zhu
Jin Zeng
Xiaolong Wu
Changyong He
Weida Wang
LRM
223
0
0
21 Nov 2025
BOP-ASK: Object-Interaction Reasoning for Vision-Language Models
V. Bhat
Sungsu Kim
Valts Blukis
Greg Heinrich
Prashanth Krishnamurthy
Ramesh Karri
Stan Birchfield
Farshad Khorrami
Jonathan Tremblay
VLM
239
1
0
20 Nov 2025
Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning
Yibin Huang
Wang Xu
Wanyue Zhang
Helu Zhi
JingJing Huang
Yangbin Xu
Yangang Sun
Conghui Zhu
Tiejun Zhao
201
0
0
20 Nov 2025
SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards
Hunar Batra
Haoqin Tu
Hardy Chen
Yuanze Lin
Cihang Xie
Ronald Clark
OffRL
ReLM
LRM
374
0
0
10 Nov 2025
iFlyBot-VLM Technical Report
Xin Nie
Zhiyuan Cheng
Yuan Zhang
Chao Ji
Jiajia Wu
Yuhan Zhang
Jia Pan
LM&Ro
331
0
0
07 Nov 2025
An Evaluation of Interleaved Instruction Tuning on Semantic Reasoning Performance in an Audio MLLM
Jiawei Liu
Enis Berk Çoban
Zarina Schevchenko
Hao Tang
Zhigang Zhu
Michael I. Mandel
Johanna Devaney
AuLLM
LRM
272
0
0
04 Nov 2025
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks
Xu Zheng
Zihao Dongfang
Lutao Jiang
Boyuan Zheng
Yulong Guo
...
L. Zhang
Danda Pani Paudel
Nicu Sebe
Luc Van Gool
Xuming Hu
LRM
VLM
721
4
0
29 Oct 2025
COOPERA: Continual Open-Ended Human-Robot Assistance
Chenyang Ma
Kai Lu
Ruta Desai
Xavier Puig
Andrew Markham
Niki Trigoni
140
2
0
27 Oct 2025
Pursuing Minimal Sufficiency in Spatial Reasoning
Yejie Guo
Yunzhong Hou
Wufei Ma
Meng Tang
Ming-Hsuan Yang
LRM
100
0
0
19 Oct 2025
TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics
Yi Han
Cheng Chi
Enshen Zhou
Shanyu Rong
Jingkun An
Pengwei Wang
Zhongyuan Wang
Lu Sheng
Shanghang Zhang
LRM
239
8
0
08 Oct 2025
Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy
Haijier Chen
Bo Xu
Shoujian Zhang
Haoze Liu
Jiaxuan Lin
Jingrong Wang
LRM
148
1
0
29 Sep 2025
Vision-Language Models as Differentiable Semantic and Spatial Rewards for Text-to-3D Generation
Weimin Bai
Yubo Li
Weijian Luo
Wenzheng Chen
He Sun
179
3
0
19 Sep 2025
3D Aware Region Prompted Vision Language Model
A. Cheng
Yang Fu
Yukang Chen
Zhijian Liu
X. Li
...
Jan Kautz
Pavlo Molchanov
Hongxu Yin
Xiaolong Wang
Sifei Liu
139
8
0
16 Sep 2025
Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes
Mohsen Gholami
A. Rezaei
Zhou Weimin
Sitong Mao
Shunbo Zhou
Yong Zhang
Mohammad Akbari
LRM
204
17
0
08 Sep 2025
Retrieval-Augmented Defense: Adaptive and Controllable Jailbreak Prevention for Large Language Models
Guangyu Yang
Jinghong Chen
Jingbiao Mei
Weizhe Lin
Bill Byrne
AAML
139
0
0
22 Aug 2025
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
Yifan Shen
Yuanzhe Liu
Jingyuan Zhu
Xu Cao
Xiaofeng Zhang
Yixiao He
Wenming Ye
James M. Rehg
Ismini Lourentzou
LRM
161
3
0
26 Jun 2025
AntiGrounding: Lifting Robotic Actions into VLM Representation Space for Decision Making
Wenbo Li
Shiyi Wang
Yiteng Chen
Huiping Zhuang
Qingyao Wu
317
0
0
14 Jun 2025
OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
Mengdi Jia
Zekun Qi
Shaochen Zhang
Wenyao Zhang
Xinqiang Yu
Jiawei He
He Wang
L. Yi
LRM
VLM
330
28
0
03 Jun 2025
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi
Wenyao Zhang
Yufei Ding
Runpei Dong
Xinqiang Yu
...
Xin Jin
Kaisheng Ma
Zhizheng Zhang
He Wang
Li Yi
LM&Ro
445
15
0
18 Feb 2025
Gradient-less Federated Gradient Boosting Trees with Learnable Learning Rates
Chenyang Ma
Xinchi Qiu
Daniel J. Beutel
Nicholas D. Lane
FedML
240
21
0
15 Apr 2023