Sublinear Variational Optimization of Gaussian Mixture Models with Millions to Billions of Parameters

Abstract

Gaussian Mixture Models (GMMs) are among the most frequently used machine learning models. However, training large, general GMMs becomes computationally prohibitive for datasets with many data points $N$ of high dimensionality $D$. For GMMs with arbitrary covariances, we here derive a highly efficient variational approximation, which is integrated with mixtures of factor analyzers (MFAs). For GMMs with $C$ components, our proposed algorithm significantly reduces the runtime complexity per iteration from $\mathcal{O}(NCD^2)$ to a complexity that scales linearly in $D$ and remains constant w.r.t. $C$. Numerical validation of this theoretical complexity reduction then shows the following: the distance evaluations required for the entire GMM optimization process scale sublinearly with $NC$. On large-scale benchmarks, this sublinearity results in speed-ups of an order of magnitude compared to the state of the art. As a proof of concept, we train GMMs with over 10 billion parameters on about 100 million images and observe training times of approximately nine hours on a single state-of-the-art CPU.
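The abstract does not spell out the algorithm, but the stated complexity reduction (per-iteration cost constant in $C$) is consistent with a truncated variational E-step in which each data point only assigns non-zero responsibility to a small set of candidate components. The following NumPy sketch illustrates that general idea only; the function name `truncated_e_step`, the truncation size `K`, and the use of plain Euclidean distances are illustrative assumptions, not details taken from the paper. In particular, this simplified sketch still computes distances to all $C$ components, which is precisely the step the proposed method is designed to avoid.

```python
# Hypothetical sketch of a truncated variational E-step for an isotropic GMM.
# Assumption: each data point restricts its variational posterior to its K
# closest components (K << C), yielding sparse responsibilities.
import numpy as np

def truncated_e_step(X, means, K=5):
    """Return, for each data point, the indices of its K nearest components
    and the softmax responsibilities restricted to those components."""
    N, D = X.shape
    C, _ = means.shape
    # Squared distances to all components; note this is the O(N*C*D) step
    # that the paper's method avoids (kept here only for simplicity).
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)   # (N, C)
    idx = np.argpartition(d2, K - 1, axis=1)[:, :K]                # (N, K)
    # Truncated responsibilities: softmax over the K selected components only.
    sel = -0.5 * np.take_along_axis(d2, idx, axis=1)
    sel -= sel.max(axis=1, keepdims=True)
    resp = np.exp(sel)
    resp /= resp.sum(axis=1, keepdims=True)
    return idx, resp

# Minimal usage example with random data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
means = rng.normal(size=(50, 16))
idx, resp = truncated_e_step(X, means, K=5)
```

Because the responsibilities are sparse (K per data point), the subsequent M-step updates touch only $NK$ data-component pairs rather than $NC$; how the candidate components are found without exhaustive distance computations, and how full covariances are handled via the MFA parameterization, is the subject of the paper itself.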
