
A Linear Systems Theory of Normalizing Flows

Abstract

Normalizing Flows are a promising new class of algorithms for unsupervised learning based on maximum likelihood optimization with change of variables. They promise to learn a factorized component representation of complex nonlinear data while simultaneously yielding a density function that can evaluate likelihoods and generate samples. Despite this potential, Normalizing Flows are curiously unstable to train in many scenarios, and there is little theoretical understanding of the learned representation. We provide a new theoretical perspective on Normalizing Flows through the lens of linear systems theory, showing that an optimal flow learns to represent the local covariance of each region of input space. We further show that the training objective is ill-posed for data with low intrinsic dimensionality, which explains why stability issues are frequently encountered. Based on this theory, we propose two practical tools. The first is a new regularization technique based on covariance shrinkage that stabilizes the optimization procedure. The second, and most important, is a new algorithm for computing un-whitened latent components from the learned model, a much richer representation than the default whitened version. Experiments on toy manifold learning datasets, as well as the MNIST image dataset, provide convincing support for our theory and tools.
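
To make the setup concrete, below is a minimal NumPy sketch of two ingredients the abstract refers to: the standard change-of-variables log-likelihood that Normalizing Flows maximize, and a covariance shrinkage step. The abstract does not specify the paper's exact regularizer, so the Ledoit-Wolf-style shrinkage form, the scaled-identity target, and the `lam` coefficient here are illustrative assumptions rather than the authors' method.

```python
import numpy as np

def flow_log_likelihood(z, log_det_jac):
    """Change-of-variables log-density: log p_X(x) = log p_Z(f(x)) + log|det J_f(x)|.

    Assumes a standard-normal base density p_Z; z = f(x) are the latent codes
    and log_det_jac is the log absolute Jacobian determinant of f at x.
    """
    d = z.shape[-1]
    log_pz = -0.5 * (np.sum(z**2, axis=-1) + d * np.log(2.0 * np.pi))
    return log_pz + log_det_jac

def shrink_covariance(sigma, lam=0.1):
    """Illustrative Ledoit-Wolf-style shrinkage toward a scaled identity:
    (1 - lam) * sigma + lam * (tr(sigma) / d) * I.

    For data concentrated near a low-dimensional manifold, the sample
    covariance is (near-)singular; shrinkage restores full rank, which is
    the kind of ill-posedness the abstract describes.
    """
    d = sigma.shape[0]
    target = (np.trace(sigma) / d) * np.eye(d)
    return (1.0 - lam) * sigma + lam * target

# Example: 5 samples in 10 dimensions give a rank-deficient covariance;
# shrinkage makes it well-conditioned.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 10))
sigma = np.cov(x, rowvar=False)
print(np.linalg.cond(sigma), np.linalg.cond(shrink_covariance(sigma)))
```

The example uses fewer samples than dimensions so that the sample covariance is rank-deficient, mirroring the low-intrinsic-dimensionality regime in which the abstract reports instability.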
