SpatialLM: Training Large Language Models for Structured Indoor Modeling
- 3DV

Main: 9 Pages
20 Figures
Bibliography: 5 Pages
12 Tables
Appendix: 10 Pages
Abstract
SpatialLM is a large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements such as walls, doors, and windows, as well as oriented object boxes with their semantic categories. Unlike previous methods that rely on task-specific network designs, our model adheres to the standard multimodal LLM architecture and is fine-tuned directly from open-source LLMs.
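To make the notion of a structured scene output concrete, the following is a minimal sketch of how walls, doors, windows, and oriented object boxes could be represented and serialized as text that a language model might emit. All class names, field names, and the text format here (`Wall`, `Opening`, `OrientedBox`, `Scene.to_text`) are hypothetical and chosen for illustration only; the abstract does not specify the actual output format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def _fmt(values) -> str:
    """Join numbers compactly, e.g. (0.0, 4.0) -> '0,4'."""
    return ",".join(f"{v:g}" for v in values)


@dataclass
class Wall:
    start: Point3      # one end of the wall's base line (hypothetical field)
    end: Point3        # other end of the wall's base line
    height: float


@dataclass
class Opening:
    kind: str          # "door" or "window"
    wall_index: int    # index of the wall this opening belongs to
    center: Point3
    width: float
    height: float


@dataclass
class OrientedBox:
    category: str      # semantic category, e.g. "sofa"
    center: Point3
    yaw: float         # rotation about the vertical axis, in radians
    size: Point3       # extents along the box's own axes


@dataclass
class Scene:
    walls: List[Wall] = field(default_factory=list)
    openings: List[Opening] = field(default_factory=list)
    boxes: List[OrientedBox] = field(default_factory=list)

    def to_text(self) -> str:
        """Serialize the scene as one line per element (assumed format)."""
        lines = []
        for i, w in enumerate(self.walls):
            lines.append(f"wall_{i}=Wall({_fmt(w.start + w.end + (w.height,))})")
        for i, o in enumerate(self.openings):
            args = _fmt(o.center + (o.width, o.height))
            lines.append(f"{o.kind}_{i}={o.kind.capitalize()}(wall_{o.wall_index},{args})")
        for i, b in enumerate(self.boxes):
            args = _fmt(b.center + (b.yaw,) + b.size)
            lines.append(f"bbox_{i}=Bbox({b.category},{args})")
        return "\n".join(lines)


# Usage: a single wall with a door and one oriented object box.
scene = Scene(
    walls=[Wall((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), 2.5)],
    openings=[Opening("door", 0, (1.0, 0.0, 1.0), 0.9, 2.0)],
    boxes=[OrientedBox("sofa", (2.0, 1.5, 0.4), 1.57, (2.0, 0.9, 0.8))],
)
text = scene.to_text()
```

Representing the scene as plain text lines is one plausible way to let a standard LLM decoder produce structured geometry without a task-specific output head, which matches the design direction the abstract describes.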
