MGNiceNet: Unified Monocular Geometric Scene Understanding

Abstract

Monocular geometric scene understanding combines panoptic segmentation and self-supervised depth estimation, focusing on real-time application in autonomous vehicles. We introduce MGNiceNet, a unified approach that uses a linked kernel formulation for panoptic segmentation and self-supervised depth estimation. MGNiceNet is based on the state-of-the-art real-time panoptic segmentation method RT-K-Net and extends the architecture to cover both panoptic segmentation and self-supervised monocular depth estimation. To this end, we introduce a tightly coupled self-supervised depth estimation predictor that explicitly uses information from the panoptic path for depth prediction. Furthermore, we introduce a panoptic-guided motion masking method to improve depth estimation without relying on video panoptic segmentation annotations. We evaluate our method on two popular autonomous driving datasets, Cityscapes and KITTI. Our model shows state-of-the-art results compared to other real-time methods and closes the gap to computationally more demanding methods. Source code and trained models are available at this https URL.

@article{schön2025_2411.11466,
  title={MGNiceNet: Unified Monocular Geometric Scene Understanding},
  author={Markus Schön and Michael Buchholz and Klaus Dietmayer},
  journal={arXiv preprint arXiv:2411.11466},
  year={2025}
}