DA-Occ: Direction-Aware 2D Convolution for Efficient and Geometry-Preserving 3D Occupancy Prediction in Autonomous Driving
Efficient and high-accuracy 3D occupancy prediction is vital for the performance of autonomous driving systems. However, existing methods struggle to balance precision and efficiency: high-accuracy approaches are often hindered by heavy computational overhead, leading to slow inference speeds, while others leverage pure bird's-eye-view (BEV) representations to gain speed at the cost of losing vertical spatial cues and compromising geometric integrity. To overcome these limitations, we build on the efficient Lift-Splat-Shoot (LSS) paradigm and propose a pure 2D framework, DA-Occ, for 3D occupancy prediction that preserves fine-grained geometry. Standard LSS-based methods lift 2D features into 3D space solely based on depth scores, making it difficult to fully capture vertical structure. To improve upon this, DA-Occ augments depth-based lifting with a complementary height-score projection that explicitly encodes vertical geometric information. We further employ direction-aware convolution to extract geometric features along both vertical and horizontal orientations, effectively balancing accuracy and computational efficiency. On the Occ3D-nuScenes, the proposed method achieves an mIoU of 39.3% and an inference speed of 27.7 FPS, effectively balancing accuracy and efficiency. In simulations on edge devices, the inference speed reaches 14.8 FPS, further demonstrating the method's applicability for real-time deployment in resource-constrained environments.
View on arXiv