
Natural Reweighted Wake-Sleep

Abstract

Helmholtz Machines (HMs) are a class of generative models composed of two Sigmoid Belief Networks (SBNs), acting as an encoder and a decoder. These models are commonly trained using a two-step optimization algorithm called Wake-Sleep (WS) and, more recently, by improved versions such as Reweighted Wake-Sleep (RWS) and Bidirectional Helmholtz Machines (BiHM). The locality of the connections in an SBN induces sparsity in the Fisher information matrix associated with the model, in the form of a fine-grained block-diagonal structure. In this paper we exploit this property to efficiently train SBNs and HMs using the natural gradient. We present a novel algorithm called Natural Reweighted Wake-Sleep (NRWS), a geometric adaptation of Reweighted Wake-Sleep in which, unlike most previous work, the natural gradient is computed without introducing any approximation of the structure of the Fisher information matrix. Experiments on standard datasets from the literature show a consistent improvement of NRWS not only over its non-geometric baseline but also over state-of-the-art training algorithms for HMs. The improvement is quantified in terms of both speed of convergence and the value of the log-likelihood reached after training.
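
To illustrate why a block-diagonal Fisher information matrix makes the natural gradient cheap to compute, the following is a minimal sketch and not the NRWS algorithm itself. It assumes the Fisher matrix is supplied as a list of small dense blocks and solves one small linear system per block instead of inverting the full matrix. The function names, the block sizes, the damping term, and the sample-based Fisher estimate are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np


def estimate_fisher_block(score_samples):
    """Estimate one Fisher block as the average outer product of score vectors.

    score_samples has shape (n_samples, block_dim). This sample-based
    estimator is an illustrative assumption, not the paper's procedure.
    """
    n_samples = score_samples.shape[0]
    return score_samples.T @ score_samples / n_samples


def natural_gradient_blockdiag(grad_blocks, fisher_blocks, damping=1e-3):
    """Return F^{-1} g computed block by block for a block-diagonal F.

    grad_blocks:   list of 1-D arrays, one gradient slice per block.
    fisher_blocks: list of square arrays, one Fisher block per slice.
    damping:       hypothetical regularizer added to each block's diagonal
                   so that every small linear system is well conditioned.
    """
    nat_blocks = []
    for g, F in zip(grad_blocks, fisher_blocks):
        F_damped = F + damping * np.eye(F.shape[0])
        # Solving a small system per block avoids forming or inverting
        # the full Fisher matrix.
        nat_blocks.append(np.linalg.solve(F_damped, g))
    return nat_blocks


# Small demonstration with arbitrary placeholder block sizes.
rng = np.random.default_rng(0)
block_dims = [3, 4]
fisher_blocks = [estimate_fisher_block(rng.normal(size=(50, d))) for d in block_dims]
grad_blocks = [rng.normal(size=d) for d in block_dims]
nat_grad = np.concatenate(natural_gradient_blockdiag(grad_blocks, fisher_blocks))
```

For b blocks of size k, the per-block solves cost on the order of b·k³ operations, compared with (b·k)³ for a dense solve over all parameters, which is why a finely grained block structure makes the exact natural gradient tractable.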
