Representation Disentanglement via Regularization by Identification

Abstract

This work focuses on the problem of learning disentangled representations from observational data. Given observations $\mathbf{x}^{(i)}$ for $i=1,\dots,N$ drawn from $p(\mathbf{x}|\mathbf{y})$, with generative variables $\mathbf{y}$ admitting the distribution factorization $p(\mathbf{y}) = \prod_{c} p(\mathbf{y}_c)$, we ask whether it is feasible to learn disentangled representations matching the space of observations, with identification guarantees on the posterior $p(\mathbf{z}|\mathbf{x}, \hat{\mathbf{y}}_c)$ for each $c$. We argue that modern deep representation learning models are ill-posed and exhibit collider-bias behaviour, a source of bias that produces entanglement between generating variables. Under the rubric of causality, we show that this issue can be explained and reconciled under the condition of identifiability, attainable under supervision or a weak form of it. To this end, we propose regularization by identification (ReI), a regularization framework defined by the identification of the causal queries involved in the learning problem. Empirical evidence shows that enforcing ReI in a variational framework yields disentangled representations that generalize to out-of-distribution examples and align well with the true expected effect between the generating variables and the measurement apparatus.
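To make the shape of such an objective concrete, the following is a minimal, purely illustrative sketch of a variational loss augmented with a per-factor identification penalty. All names (`rei_objective`, the squared-error penalty tying each latent block $\mathbf{z}_c$ to its generative variable $\mathbf{y}_c$, the weight `lam`) are assumptions for illustration; the paper's actual ReI term is defined through identification of causal queries and may differ substantially.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def rei_objective(x, x_recon, mu, logvar, z_blocks, y_blocks, lam=1.0):
    """Illustrative variational loss with a per-factor identification penalty.

    recon  : squared reconstruction error (negative log-likelihood up to const.)
    kl     : standard Gaussian KL term of a variational objective
    ident  : hypothetical stand-in for an identification regularizer, tying
             each latent block z_c to its generative variable y_c.
    """
    recon = np.sum((x - x_recon) ** 2)
    kl = gaussian_kl(mu, logvar)
    ident = sum(np.sum((z_c - y_c) ** 2)
                for z_c, y_c in zip(z_blocks, y_blocks))
    return recon + kl + lam * ident
```

Under this sketch, a perfect reconstruction with a posterior matching the prior and latent blocks matching their generative variables drives the loss to zero; the `lam` weight trades off the variational terms against the identification constraint.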
