
Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation

Abstract

Multi-modal semantic segmentation (MMSS) faces significant challenges in real-world scenarios due to dynamic environments, sensor failures, and noise interference, creating a gap between theoretical models and practical performance. To address this, we propose a two-stage framework called RobustSeg, which enhances multi-modal robustness through two key components: the Hybrid Prototype Distillation Module (HPDM) and the Representation Regularization Module (RRM). In the first stage, RobustSeg pre-trains a multi-modal teacher model using complete modalities. In the second stage, a student model is trained with random modality dropout while learning from the teacher via HPDM and RRM. HPDM transforms features into compact prototypes, enabling cross-modal hybrid knowledge distillation and mitigating bias from missing modalities. RRM reduces representation discrepancies between the teacher and student by optimizing functional entropy through the log-Sobolev inequality. Extensive experiments on three public benchmarks demonstrate that RobustSeg outperforms previous state-of-the-art methods, with improvements of +2.76%, +4.56%, and +0.98% on the three benchmarks, respectively. Code is available at: this https URL.
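As background for the RRM objective (standard definitions, not the paper's exact formulation): the functional entropy of a positive function g under a probability measure \mu, and the Gaussian log-Sobolev inequality that bounds it by a gradient term, are

\[
  \operatorname{Ent}_{\mu}(g) = \mathbb{E}_{\mu}[\,g \log g\,] - \mathbb{E}_{\mu}[g]\,\log \mathbb{E}_{\mu}[g],
\]
\[
  \operatorname{Ent}_{\mu}(f^{2}) \le 2\,\mathbb{E}_{\mu}\!\left[\|\nabla f\|^{2}\right] \quad \text{for the standard Gaussian measure } \mu \text{ on } \mathbb{R}^{n}.
\]

The two-stage recipe in the abstract (a frozen full-modality teacher, a student trained under random modality dropout, prototype-based distillation) can be sketched as below. This is a minimal illustrative sketch in PyTorch, not the released implementation; PrototypeHead, modality_dropout, and distill_loss are hypothetical names chosen for exposition.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeHead(nn.Module):
    # Hypothetical stand-in for HPDM's prototype construction: soft-assign
    # every spatial location to K learnable prototypes, then pool.
    def __init__(self, dim, num_prototypes=16):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))

    def forward(self, feats):                        # feats: (B, C, H, W)
        flat = feats.flatten(2).transpose(1, 2)      # (B, H*W, C)
        assign = torch.softmax(flat @ self.prototypes.t(), dim=-1)  # (B, HW, K)
        pooled = assign.transpose(1, 2) @ flat       # (B, K, C)
        return F.normalize(pooled, dim=-1)

def modality_dropout(x_rgb, x_aux, p=0.5):
    # Randomly zero out one modality during student training; keep at least one.
    if torch.rand(()).item() < p:
        return x_rgb, torch.zeros_like(x_aux)
    if torch.rand(()).item() < p:
        return torch.zeros_like(x_rgb), x_aux
    return x_rgb, x_aux

def distill_loss(student_proto, teacher_proto, tau=0.1):
    # Match the student's prototype-similarity structure to the frozen
    # teacher's via a KL divergence over pairwise prototype similarities.
    s = student_proto @ student_proto.transpose(1, 2) / tau   # (B, K, K)
    t = teacher_proto @ teacher_proto.transpose(1, 2) / tau
    return F.kl_div(F.log_softmax(s, dim=-1), F.softmax(t, dim=-1),
                    reduction='batchmean')

# Toy usage: the student sees randomly dropped inputs, while the frozen
# teacher always sees complete modalities.
rgb, depth = torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64)
rgb_s, depth_s = modality_dropout(rgb, depth)        # student inputs
# ... encoders (omitted) would map inputs to (B, 64, 32, 32) features ...
f_teacher = torch.randn(2, 64, 32, 32)               # placeholder features
f_student = torch.randn(2, 64, 32, 32)
head_t, head_s = PrototypeHead(64), PrototypeHead(64)
loss = distill_loss(head_s(f_student), head_t(f_teacher).detach())

In this sketch, pixels are soft-assigned to a small set of prototypes so that distillation matches compact prototype-similarity structure rather than dense feature maps, which is one plausible way to realize the cross-modal hybrid distillation the abstract describes.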

@article{tan2025_2505.12861,
  title={Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation},
  author={Jiaqi Tan and Xu Zheng and Yang Liu},
  journal={arXiv preprint arXiv:2505.12861},
  year={2025}
}