f-Divergence Variational Inference

This paper introduces f-divergence variational inference (f-VI), which generalizes variational inference to all f-divergences. Starting from the minimization of a surrogate f-divergence that shares statistical consistency with the original f-divergence, the f-VI framework not only unifies a number of existing VI methods, e.g. Kullback-Leibler VI, Rényi's α-VI, and χ-VI, but also offers a standardized toolkit for VI subject to arbitrary divergences from the f-divergence family. A general f-variational bound is derived that provides a sandwich estimate of the marginal likelihood (or evidence). The development of f-VI proceeds with a stochastic optimization scheme that utilizes the reparameterization trick, importance weighting, and Monte Carlo approximation; a mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference (CAVI) is also proposed for f-VI. Empirical examples, including variational autoencoders and Bayesian neural networks, are provided to demonstrate the effectiveness and wide applicability of f-VI.
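As a rough illustration of the kind of estimator the stochastic optimization scheme involves, the following is a minimal numpy sketch, not the authors' implementation: for a convex f, Jensen's inequality gives E_q[f(p(x,z)/q(z|x))] >= f(p(x)), so a reparameterized, importance-weighted Monte Carlo average of f(p(x,z)/q(z|x)) estimates one side of a bound on the evidence; with a decreasing f such as -log this yields an upper bound on -log p(x) (equivalently a lower bound on the evidence), while an increasing convex f would bound it from the other side, which is one way a sandwich estimate can arise. The toy Gaussian model, the variational family, and names such as f_bound and log_joint are assumptions made here purely for illustration.

    # Illustrative sketch (not the paper's code): Monte Carlo estimate of a
    # generic Jensen-type f-variational bound E_q[f(p(x,z)/q(z|x))] >= f(p(x))
    # for convex f, using the reparameterization trick and importance weights.
    # Toy (hypothetical) model: p(x, z) = N(z; 0, 1) * N(x; z, 1).
    import numpy as np

    def log_joint(x, z):
        # log p(x, z): standard normal prior on z, unit-variance Gaussian likelihood
        log_prior = -0.5 * (z**2 + np.log(2 * np.pi))
        log_lik = -0.5 * ((x - z)**2 + np.log(2 * np.pi))
        return log_prior + log_lik

    def log_q(z, mu, sigma):
        # log density of the Gaussian variational posterior q(z|x) = N(mu, sigma^2)
        return -0.5 * (((z - mu) / sigma)**2 + np.log(2 * np.pi)) - np.log(sigma)

    def f_bound(x, mu, sigma, f, n_samples=1000, seed=0):
        # Reparameterized Monte Carlo estimate of E_q[f(p(x,z)/q(z|x))]
        rng = np.random.default_rng(seed)
        eps = rng.standard_normal(n_samples)
        z = mu + sigma * eps                           # reparameterization trick
        log_w = log_joint(x, z) - log_q(z, mu, sigma)  # log importance weights
        return np.mean(f(np.exp(log_w)))

    x = 0.5
    # Convex, decreasing f(t) = -log t recovers the negative ELBO, i.e. an
    # upper bound on -log p(x); other convex f's from the f-divergence family
    # can be plugged in the same way.
    upper = f_bound(x, mu=0.2, sigma=0.9, f=lambda t: -np.log(t))
    print("upper bound on -log p(x):", upper)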