Why Do MLLMs Struggle with Spatial Understanding? A Systematic Analysis from Data to Architecture

2 September 2025
Wanyue Zhang
Yibin Huang
Yangbin Xu
JingJing Huang
Helu Zhi
Shuo Ren
Wang Xu
Jiajun Zhang
    LRM

Papers citing "Why Do MLLMs Struggle with Spatial Understanding? A Systematic Analysis from Data to Architecture"

9 / 9 papers shown
Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning
Yibin Huang
Wang Xu
Wanyue Zhang
Helu Zhi
JingJing Huang
Yangbin Xu
Yangang Sun
Conghui Zhu
Tiejun Zhao
198
0
0
20 Nov 2025
Spatial Reasoning in Multimodal Large Language Models: A Survey of Tasks, Benchmarks and Methods
Weichen Liu
Qiyao Xue
Haoming Wang
Xiangyu Yin
Boyuan Yang
Wei Gao
111
1
0
14 Nov 2025
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
Yuhong Liu
Beichen Zhang
Yuhang Zang
Yuhang Cao
Long Xing
Xiaoyi Dong
Haodong Duan
Dahua Lin
J. Wang
LRM
196
4
0
31 Oct 2025
Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization
Nikita Kachaev
Mikhail Kolosov
Daniil Zelezetsky
A. Kovalev
Aleksandr I. Panov
VLM
320
2
0
29 Oct 2025
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks
Xu Zheng
Zihao Dongfang
Lutao Jiang
Boyuan Zheng
Yulong Guo
...
L. Zhang
Danda Pani Paudel
Nicu Sebe
Luc Van Gool
Xuming Hu
LRM VLM
713
4
0
29 Oct 2025
DynaSolidGeo: A Dynamic Benchmark for Genuine Spatial Mathematical Reasoning of VLMs in Solid Geometry
Changti Wu
Shijie Lian
Zihao Liu
Lei Zhang
Laurence Tianruo Yang
Kai Chen
AIMat
439
0
0
25 Oct 2025
Learning GUI Grounding with Spatial Reasoning from Visual Feedback
Yu Zhao
Wei Chen
Huseyin A. Inan
Samuel Kessler
Lu Wang
...
Fangkai Yang
Chaoyun Zhang
Pasquale Minervini
Saravan Rajmohan
Robert Sim
124
1
0
25 Sep 2025
Vision Language Models Are Not (Yet) Spelling Correctors
Junhong Liang
Bojun Zhang
VLM
59
0
0
22 Sep 2025
Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models
Pu Jian
Junhong Wu
Wei Sun
Chen Wang
Shuo Ren
Jiajun Zhang
LRM
135
5
0
15 Sep 2025