Adversarially Domain-adaptive Latent Diffusion for Unsupervised Semantic Segmentation

22 December 2024

Jongmin Yu

Zhongtian Sun

Shan Luo

Jinhong Yang

Main: 8 pages; 6 figures; bibliography: 3 pages; 3 tables

Abstract

Semantic segmentation requires extensive pixel-level annotation, motivating unsupervised domain adaptation (UDA) to transfer knowledge from labelled source domains to unlabelled or weakly labelled target domains. One of the most efficient strategies involves using synthetic datasets generated within controlled virtual environments, such as video games or traffic simulators, which can automatically generate pixel-level annotations. However, even when such datasets are available, learning a well-generalised representation that captures both domains remains challenging, owing to probabilistic and geometric discrepancies between the virtual world and real-world imagery. This work introduces a semantic segmentation method based on latent diffusion models, termed Inter-Coder Connected Latent Diffusion (ICCLD), alongside an unsupervised domain adaptation approach. The model employs an inter-coder connection to enhance contextual understanding and preserve fine details, while adversarial learning aligns latent feature distributions across domains during the latent diffusion process. Experiments on GTA5, Synthia, and Cityscapes demonstrate that ICCLD outperforms state-of-the-art UDA methods, achieving mIoU scores of 74.4 (GTA5 $\rightarrow$ Cityscapes) and 67.2 (Synthia $\rightarrow$ Cityscapes).
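
The reported scores are mean intersection-over-union (mIoU) values. As a minimal sketch of how that metric is defined, and not the authors' evaluation code, the function below averages per-class IoU over flattened label lists. The function name `mean_iou`, the `ignore_index` value of 255, and the toy labels are assumptions made for illustration; benchmark evaluations normally accumulate the counts over the whole validation set rather than a single image.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; adjust to the dataset
) -> float:
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels without a valid ground-truth label are skipped
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both maps do not affect the mean
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six pixels, three classes, last pixel is unlabelled.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")  # per-class IoU: 1/2, 2/3, 1 -> mIoU = 72.2
```

Scores such as 74.4 and 67.2 correspond to this quantity expressed on a 0 to 100 scale.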
