Why Do MLLMs Struggle with Spatial Understanding? A Systematic Analysis from Data to Architecture

2 September 2025

Papers citing "Why Do MLLMs Struggle with Spatial Understanding? A Systematic Analysis from Data to Architecture"

Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning

198

20 Nov 2025

Spatial Reasoning in Multimodal Large Language Models: A Survey of Tasks, Benchmarks and Methods

111

14 Nov 2025

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

196

31 Oct 2025

Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization

320

29 Oct 2025

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

...

713

29 Oct 2025

DynaSolidGeo: A Dynamic Benchmark for Genuine Spatial Mathematical Reasoning of VLMs in Solid Geometry

Laurence Tianruo Yang

Kai Chen

AIMat

439

25 Oct 2025

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

...

124

25 Sep 2025

Vision Language Models Are Not (Yet) Spelling Correctors

Junhong Liang

Bojun Zhang

VLM

22 Sep 2025

Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models

135

15 Sep 2025