2506.05318
Cited By
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs
5 June 2025
Haoyuan Li
Yanpeng Zhou
Yufei Gao
Tao Tang
J. N. Han
Yujie Yuan
Dave Zhenyu Chen
Jiawang Bian
Hang Xu
Xiaodan Liang
Papers citing
"Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs"
3 / 3 papers shown
Title
Seeing Across Views: Benchmarking Spatial Reasoning of Vision-Language Models in Robotic Scenes
Zhiyuan Feng
Zhaolu Kang
Qijie Wang
Zhiying Du
Jiongrui Yan
...
Shawn Chen
Sicheng Xu
Yaobo Liang
Jiaolong Yang
B. Guo
56
0
0
22 Oct 2025
Where, Not What: Compelling Video LLMs to Learn Geometric Causality for 3D-Grounding
Yutong Zhong
VGen
44
0
0
19 Oct 2025
Reasoning in Space via Grounding in the World
Yiming Chen
Zekun Qi
Wenyao Zhang
Xin Jin
Li Zhang
Peidong Liu
LRM
73
0
0
15 Oct 2025