f-Divergence Variational Inference

This paper introduces f-divergence variational inference (f-VI), which generalizes variational inference to all f-divergences. Starting from the minimization of a surrogate f-divergence that shares statistical consistency with the original f-divergence, the f-VI framework not only unifies a number of existing VI methods, e.g. Kullback-Leibler VI, Rényi's α-VI, and χ-VI, but also offers a standardized toolkit for VI subject to arbitrary divergences from the f-divergence family. A general f-variational bound is derived that provides a sandwich estimate of the marginal likelihood (or evidence). The development of f-VI proceeds with a stochastic optimization scheme that utilizes the reparameterization trick, importance weighting, and Monte Carlo approximation; a mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference (CAVI) is also proposed for f-VI. Empirical examples, including variational autoencoders and Bayesian neural networks, are provided to demonstrate the effectiveness and wide applicability of f-VI.
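As a rough illustration of the kind of estimator the stochastic optimization scheme involves, the following is a minimal numpy sketch, not the authors' implementation: for a convex f, Jensen's inequality gives E_q[f(p(x,z)/q(z|x))] >= f(p(x)), so a reparameterized, importance-weighted Monte Carlo average of f(p(x,z)/q(z|x)) estimates one side of a bound on the evidence; with a decreasing f such as -log this yields an upper bound on -log p(x) (equivalently a lower bound on the evidence), while an increasing convex f would bound it from the other side, which is one way a sandwich estimate can arise. The toy Gaussian model, the variational family, and names such as f_bound and log_joint are assumptions made here purely for illustration.

    # Illustrative sketch (not the paper's code): Monte Carlo estimate of a
    # generic Jensen-type f-variational bound E_q[f(p(x,z)/q(z|x))] >= f(p(x))
    # for convex f, using the reparameterization trick and importance weights.
    # Toy (hypothetical) model: p(x, z) = N(z; 0, 1) * N(x; z, 1).
    import numpy as np

    def log_joint(x, z):
        # log p(x, z): standard normal prior on z, unit-variance Gaussian likelihood
        log_prior = -0.5 * (z**2 + np.log(2 * np.pi))
        log_lik = -0.5 * ((x - z)**2 + np.log(2 * np.pi))
        return log_prior + log_lik

    def log_q(z, mu, sigma):
        # log density of the Gaussian variational posterior q(z|x) = N(mu, sigma^2)
        return -0.5 * (((z - mu) / sigma)**2 + np.log(2 * np.pi)) - np.log(sigma)

    def f_bound(x, mu, sigma, f, n_samples=1000, seed=0):
        # Reparameterized Monte Carlo estimate of E_q[f(p(x,z)/q(z|x))]
        rng = np.random.default_rng(seed)
        eps = rng.standard_normal(n_samples)
        z = mu + sigma * eps                           # reparameterization trick
        log_w = log_joint(x, z) - log_q(z, mu, sigma)  # log importance weights
        return np.mean(f(np.exp(log_w)))

    x = 0.5
    # Convex, decreasing f(t) = -log t recovers the negative ELBO, i.e. an
    # upper bound on -log p(x); other convex f's from the f-divergence family
    # can be plugged in the same way.
    upper = f_bound(x, mu=0.2, sigma=0.9, f=lambda t: -np.log(t))
    print("upper bound on -log p(x):", upper)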