
Revisiting Invariant Learning for Out-of-Domain Generalization on Multi-Site Mammogram Datasets

Main: 3 Pages
2 Figures
Bibliography: 2 Pages
Abstract

Achieving health equity in Artificial Intelligence (AI) requires diagnostic models that remain reliable across diverse populations. However, breast cancer screening systems frequently suffer from domain overfitting, and their performance degrades significantly when they are deployed to populations with different demographics. While Invariant Learning algorithms aim to mitigate this by suppressing site-specific correlations, their efficacy in medical imaging remains underexplored. This study comprehensively evaluates domain generalization techniques for mammography. We constructed a multi-source training environment aggregating datasets from the United States (CBIS-DDSM, EMBED), Portugal (INbreast, BCDR), and Cyprus (BMCD). To assess global generalizability, we evaluated performance on unseen cohorts from Egypt (CDD-CESM) and Sweden (CSAW-CC). We benchmarked Invariant Risk Minimization (IRM) and Variance Risk Extrapolation (VREx) against a rigorously optimized Empirical Risk Minimization (ERM) baseline. Contrary to expectations, standard ERM consistently outperformed the specialized invariant mechanisms in out-of-domain testing. While VREx showed potential in stabilizing attention maps, the invariant objectives proved unstable and prone to underfitting. We conclude that engineering equitable AI is currently best served by maximizing multi-national data diversity rather than relying on complex algorithmic invariance.
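
To make the three benchmarked objectives concrete, the following is a minimal sketch of how ERM, IRM, and VREx differ for a binary classifier trained on several sites. It is not the authors' implementation: it assumes the commonly used IRMv1 form of the IRM penalty (squared gradient of each environment's risk with respect to a dummy scale fixed at 1), the variance-of-risks form of the VREx penalty, an equal-weight average of per-environment risks for ERM, and a single hypothetical penalty weight `lam`. The function names and the toy logits are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function (adequate for the moderate logits used in this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_risk(logits, labels):
    """Mean binary cross-entropy for one environment, computed from raw logits."""
    total = 0.0
    for z, y in zip(logits, labels):
        # Numerically stable form of log(1 + exp(z)) - y * z.
        total += max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def irm_penalty(logits, labels):
    """IRMv1-style penalty for one environment (assumed form).

    The risk is evaluated on w * z with a dummy scale w; for cross-entropy,
    its derivative at w = 1 is mean((sigmoid(z) - y) * z). The penalty is
    the square of that derivative.
    """
    grad = sum((sigmoid(z) - y) * z for z, y in zip(logits, labels)) / len(logits)
    return grad ** 2


def objectives(envs, lam=1.0):
    """Return the ERM, IRM, and VREx objective values.

    envs: list of (logits, labels) pairs, one pair per training site.
    lam:  hypothetical penalty weight shared by IRM and VREx.
    """
    risks = [bce_risk(z, y) for z, y in envs]
    mean_risk = sum(risks) / len(risks)

    # ERM (as assumed here): average risk, no invariance term.
    erm = mean_risk
    # IRM: average risk plus the mean per-environment gradient penalty.
    irm = mean_risk + lam * sum(irm_penalty(z, y) for z, y in envs) / len(envs)
    # VREx: average risk plus the (population) variance of per-site risks.
    variance = sum((r - mean_risk) ** 2 for r in risks) / len(risks)
    vrex = mean_risk + lam * variance
    return {"erm": erm, "irm": irm, "vrex": vrex}


# Toy usage: two hypothetical sites with hand-written logits and labels.
toy_envs = [([2.0, -1.0], [1, 0]), ([0.5, -0.5], [1, 1])]
print(objectives(toy_envs, lam=1.0))
```

Both invariant objectives reduce to ERM when `lam` is 0, and the VREx term vanishes whenever every site has the same risk. In practice the penalties are computed with automatic differentiation on mini-batches rather than in closed form as above.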
