ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.05456
22
2

SITE: towards Spatial Intelligence Thorough Evaluation

8 May 2025
W. Wang
Reuben Tan
Pengyue Zhu
Jianwei Yang
Zhengyuan Yang
Lijuan Wang
Andrey Kolobov
Jianfeng Gao
Boqing Gong
ArXivPDFHTML
Abstract

Spatial intelligence (SI) represents a cognitive ability encompassing the visualization, manipulation, and reasoning about spatial relationships, underpinning disciplines from neuroscience to robotics. We introduce SITE, a benchmark dataset towards SI Thorough Evaluation in a standardized format of multi-choice visual question-answering, designed to assess large vision-language models' spatial intelligence across diverse visual modalities (single-image, multi-image, and video) and SI factors (figural to environmental scales, spatial visualization and orientation, intrinsic and extrinsic, static and dynamic). Our approach to curating the benchmark combines a bottom-up survey about 31 existing datasets and a top-down strategy drawing upon three classification systems in cognitive science, which prompt us to design two novel types of tasks about view-taking and dynamic scenes. Extensive experiments reveal that leading models fall behind human experts especially in spatial orientation, a fundamental SI factor. Moreover, we demonstrate a positive correlation between a model's spatial reasoning proficiency and its performance on an embodied AI task.

View on arXiv
@article{wang2025_2505.05456,
  title={ SITE: towards Spatial Intelligence Thorough Evaluation },
  author={ Wenqi Wang and Reuben Tan and Pengyue Zhu and Jianwei Yang and Zhengyuan Yang and Lijuan Wang and Andrey Kolobov and Jianfeng Gao and Boqing Gong },
  journal={arXiv preprint arXiv:2505.05456},
  year={ 2025 }
}
Comments on this paper