Medical image segmentation models often struggle to generalize across different domains due to domain shifts arising from variations in imaging devices, acquisition protocols, and patient populations. Domain Generalization (DG) methods address this either through representation learning or data augmentation (DAug). While representation learning methods seek domain-invariant features, they often rely on ad-hoc techniques and lack formal guarantees. DAug methods, which enrich model representations with synthetic samples, have shown comparable or superior performance to representation learning approaches. We propose LangDAug, a novel Langevin Data Augmentation method for multi-source domain generalization in 2D medical image segmentation. LangDAug leverages Energy-Based Models (EBMs) trained via contrastive divergence to traverse between source domains, generating intermediate samples through Langevin dynamics. Theoretical analysis shows that LangDAug induces a regularization effect and, for generalized linear models (GLMs), upper-bounds the Rademacher complexity by the intrinsic dimensionality of the data manifold. Through extensive experiments on Fundus segmentation and 2D MRI prostate segmentation benchmarks, we show that LangDAug outperforms state-of-the-art domain generalization methods and effectively complements existing domain-randomization approaches. The codebase for our method is available at this https URL.
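To make the sampling step concrete, the sketch below shows how Langevin dynamics over a trained energy-based model could drift a source-domain image toward another domain, collecting intermediate samples as augmentations. This is a minimal illustration, not the authors' implementation: the function name langevin_traverse, the step size, the noise scale, and the assumption that the EBM is a PyTorch module returning a per-sample scalar energy are all hypothetical.

import torch

def langevin_traverse(energy_model, x_src, n_steps=20, step_size=1e-2, noise_scale=5e-3):
    # Minimal sketch: run Langevin dynamics on a batch of source images so that
    # they drift toward low-energy regions of an EBM trained (e.g. via contrastive
    # divergence) on another source domain, yielding intermediate samples.
    x = x_src.clone().detach().requires_grad_(True)
    intermediates = []
    for _ in range(n_steps):
        energy = energy_model(x).sum()                # assumed: scalar energy E(x) per sample, summed over the batch
        grad = torch.autograd.grad(energy, x)[0]      # dE/dx
        # Langevin update: gradient step on the energy plus Gaussian noise
        x = (x - step_size * grad + noise_scale * torch.randn_like(x)).detach().requires_grad_(True)
        intermediates.append(x.detach().clone())
    return intermediates                              # augmented samples lying between the two domains

In practice such intermediate samples would be paired with the original segmentation masks and mixed into training, which is what gives the data-augmentation (rather than representation-learning) flavour described in the abstract.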
@article{tiwary2025_2505.19659,
  title={LangDAug: Langevin Data Augmentation for Multi-Source Domain Generalization in Medical Image Segmentation},
  author={Piyush Tiwary and Kinjawl Bhattacharyya and Prathosh A.P},
  journal={arXiv preprint arXiv:2505.19659},
  year={2025}
}