OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models

arXiv: 2506.03135

3 June 2025

Papers citing "OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models"

18 / 18 papers shown

Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning: A Cognitive Science Perspective

93

0

0

02 Dec 2025

Geometrically-Constrained Agent for Spatial Reasoning

121

0

0

27 Nov 2025

DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

Shanghang Zhang

112

3

0

27 Nov 2025

G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

295

0

0

26 Nov 2025

Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning

205

0

0

20 Nov 2025

FlexiCup: Wireless Multimodal Suction Cup with Dual-Zone Vision-Tactile Sensing

...

153

2

0

18 Nov 2025

Spatial Reasoning in Multimodal Large Language Models: A Survey of Tasks, Benchmarks and Methods

117

1

0

14 Nov 2025

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

202

14

0

30 Oct 2025

Pelican-VL 1.0: A Foundation Brain Model for Embodied Intelligence

...

160

1

0

30 Oct 2025

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

...

Danda Pani Paudel

731

5

0

29 Oct 2025

Seeing Across Views: Benchmarking Spatial Reasoning of Vision-Language Models in Robotic Scenes

...

160

1

0

22 Oct 2025

Spatial-DISE: A Unified Benchmark for Evaluating Spatial Reasoning in Vision-Language Models

Guangliang Cheng

282

0

0

15 Oct 2025

Automated Repeatable Adversary Threat Emulation with Effects Language (EL)

Suresh Damodaran

141

10

0

07 Oct 2025

How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective

...

324

12

0

23 Sep 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

...

306

298

0

25 Aug 2025

Holistic Evaluation of Multimodal LLMs on Spatial Intelligence

...

272

0

0

18 Aug 2025

DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

...

230

51

0

06 Jul 2025

Positional Prompt Tuning for Efficient 3D Representation Learning

403

10

0

21 Aug 2024
