
Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Main: 9 pages · 4 figures · 5 tables · Bibliography: 3 pages · Appendix: 13 pages
Abstract

We consider the task of out-of-distribution (OOD) generalization, where the distribution shift is due to an unobserved confounder ($Z$) affecting both the covariates ($X$) and the labels ($Y$). This confounding introduces heterogeneity in the predictor, i.e., $P(Y \mid X) = \mathbb{E}_{P(Z \mid X)}[P(Y \mid X, Z)]$, making traditional covariate and label shift assumptions unsuitable. OOD generalization differs from traditional domain adaptation in that it does not assume access to the covariate distribution ($X^\text{te}$) of the test samples during training. These conditions create a challenging scenario for OOD robustness: (a) $Z^\text{tr}$ is an unobserved confounder during training, (b) $P^\text{te}(Z) \neq P^\text{tr}(Z)$, (c) $X^\text{te}$ is unavailable during training, and (d) the predictive distribution depends on $P^\text{te}(Z)$. While prior work has developed complex predictors requiring multiple additional variables for identifiability of the latent distribution, we explore a set of identifiability assumptions that yield a surprisingly simple predictor using only a single additional variable. Our approach demonstrates superior empirical performance on several benchmark tasks.
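To see why the predictive distribution depends on $P^\text{te}(Z)$, consider a minimal numeric sketch (not the paper's method; all numbers are hypothetical): with a binary confounder $Z$, the observable predictor $P(Y \mid X)$ is a $Z$-weighted mixture of the conditionals $P(Y \mid X, Z)$, so it changes whenever the confounder distribution shifts, even though each conditional stays fixed.

```python
# Toy illustration of P(Y | X) = E_{P(Z | X)}[P(Y | X, Z)] for a binary
# confounder Z and a fixed covariate value x. All probabilities below are
# made-up numbers chosen only to show the effect of a shift in P(Z).

# Conditionals P(Y=1 | X=x, Z=z), assumed invariant across train and test
p_y_given_xz = {0: 0.2, 1: 0.9}

def predictor(p_z_given_x):
    """P(Y=1 | X=x) = sum over z of P(Z=z | X=x) * P(Y=1 | X=x, Z=z)."""
    return sum(p_z_given_x[z] * p_y_given_xz[z] for z in (0, 1))

# Training-time confounder distribution: P(Z=1 | X=x) = 0.3
p_train = predictor({0: 0.7, 1: 0.3})  # 0.7*0.2 + 0.3*0.9 = 0.41

# Test-time shift: P(Z=1 | X=x) = 0.8, so P^te(Z) != P^tr(Z)
p_test = predictor({0: 0.2, 1: 0.8})   # 0.2*0.2 + 0.8*0.9 = 0.76

print(p_train, p_test)  # the same x gets a very different prediction
```

A model that fits $P^\text{tr}(Y \mid X)$ directly bakes in the training-time mixture weights, which is exactly why covariate- and label-shift corrections do not apply here.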
