Existing vision-based 3D occupancy prediction methods are inherently limited in accuracy because they rely exclusively on street-view imagery and neglect the potential benefits of satellite views. We propose SA-Occ, the first Satellite-Assisted 3D occupancy prediction model, which leverages GPS and IMU data to integrate historical yet readily available satellite imagery into real-time applications, effectively mitigating the limitations of ego-vehicle perception, including occlusions and degraded performance in distant regions. To address the core challenges of cross-view perception, we propose: 1) Dynamic-Decoupling Fusion, which resolves inconsistencies in dynamic regions caused by the temporal asynchrony between satellite and street views; 2) 3D-Proj Guidance, a module that enhances 3D feature extraction from inherently 2D satellite imagery; and 3) Uniform Sampling Alignment, which aligns the sampling density between street and satellite views. Evaluated on Occ3D-nuScenes, SA-Occ achieves state-of-the-art performance, especially among single-frame methods, with a 39.05% mIoU (a 6.97% improvement), while incurring only 6.93 ms of additional latency per frame. Our code and newly curated dataset are available at this https URL.
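For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Intersection-over-Union (mIoU) is generally computed over per-voxel class labels. It is an illustration only, not the SA-Occ or Occ3D-nuScenes evaluation code: the function name `mean_iou`, the flat label lists, and the toy data are assumptions, and benchmark-specific details such as the class list or visibility masks are not reproduced here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU over classes present in pred or target.

    Hypothetical helper for illustration; labels are integers in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Count, per class, how many voxels are predicted, labeled, and both.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example (assumed data): 6 voxels, 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A figure such as 39.05% mIoU is the same kind of average, taken over the benchmark's semantic classes.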
@article{chen2025_2503.16399,
  title={SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World},
  author={Chen Chen and Zhirui Wang and Taowei Sheng and Yi Jiang and Yundu Li and Peirui Cheng and Luning Zhang and Kaiqiang Chen and Yanfeng Hu and Xue Yang and Xian Sun},
  journal={arXiv preprint arXiv:2503.16399},
  year={2025}
}