Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

18 December 2024

Papers citing "Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces"

3 / 3 papers shown

Title
Grounding Task Assistance with Multimodal Cues from a Single Demonstration Gabriel Sarch Balasaravanan Thoravi Kumaravel Sahithya Ravi Vibhav Vineet A. D. Wilson 15 0 0 02 May 2025
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D Jiahui Zhang Yurui Chen Yanpeng Zhou Yueming Xu Ze Huang ... Xinyue Cai G. Huang Xingyue Quan Hang Xu Li Zhang LRM 77 0 0 29 Mar 2025
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game Z. Wang Yurui Dong Fuwen Luo Minyuan Ruan Zhili Cheng C. L. P. Chen Peng Li Yang Liu LRM 66 0 0 13 Mar 2025