Data Mapping for Restricted Boltzmann Machine

Abstract

A restricted Boltzmann machine (RBM) is a two-layer neural network constructed as a probabilistic model, and its training maximizes a product of probabilities by the contrastive divergence (CD) scheme. In this paper a data mapping is used to describe the relationship between the visible and hidden layers, and training minimizes the squared error of the reconstructed visible layer by gradient descent or a finite difference approximation. This paper presents three new findings: 1) nodes on the visible and hidden layers can take real-valued matrix data without a probabilistic interpretation; 2) the well-known CD1 is a finite difference approximation of gradient descent after ignoring the second-order error; 3) the activation can be a non-sigmoid function such as the identity, ReLU and softsign. The data mapping provides a unified framework for dimensionality reduction, feature extraction and data representation, pioneered and developed by Hinton and his colleagues. As an approximation of gradient descent, finite difference learning is applicable to both directed and undirected graphs. Numerical experiments confirm these new findings on very low dimensionality reduction, matrix data and flexible activations.
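The training scheme described in the abstract can be sketched in a few lines of NumPy. The sketch below is an illustration under stated assumptions, not the paper's exact method: the visible data are mapped to the hidden layer as h = f(Wv + b), reconstructed as v_rec = f(Wᵀh + c), and the squared reconstruction error is minimized by a central finite-difference approximation of gradient descent. All names (W, b, c, the softsign activation f, and the step sizes) are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 3
V = rng.random((20, n_vis))          # toy real-valued visible data

f = lambda x: x / (1.0 + np.abs(x))  # softsign, one of the flexible activations

def recon_error(W, b, c):
    """Squared error of the reconstructed visible layer."""
    H = f(V @ W.T + b)               # visible -> hidden mapping
    V_rec = f(H @ W + c)             # hidden -> reconstructed visible
    return np.sum((V_rec - V) ** 2)

W = 0.1 * rng.standard_normal((n_hid, n_vis))
b = np.zeros(n_hid)
c = np.zeros(n_vis)

eps, lr = 1e-5, 0.005
err0 = recon_error(W, b, c)
for _ in range(50):
    # central finite differences for every entry of every parameter array,
    # used in place of the analytic gradient
    for P in (W, b, c):
        G = np.zeros_like(P)
        it = np.nditer(P, flags=["multi_index"])
        for _v in it:
            i = it.multi_index
            old = P[i]
            P[i] = old + eps
            e_plus = recon_error(W, b, c)
            P[i] = old - eps
            e_minus = recon_error(W, b, c)
            P[i] = old
            G[i] = (e_plus - e_minus) / (2 * eps)
        P -= lr * G                  # gradient-descent step
err1 = recon_error(W, b, c)
print(err0, err1)
```

On this toy data the reconstruction error decreases over the iterations, which is all the sketch is meant to show; the paper's actual experiments, activations and problem sizes may differ.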
