The Sparse Hausdorff Moment Problem, with Application to Topic Models

Abstract

We consider the problem of identifying, from its first $m$ noisy moments, a probability distribution on $[0,1]$ of support $k<\infty$. This is equivalent to the problem of learning a distribution on $m$ observable binary random variables $X_1,X_2,\dots,X_m$ that are iid conditional on a hidden random variable $U$ taking values in $\{1,2,\dots,k\}$. Our focus is on accomplishing this with $m=2k$, which is the minimum $m$ for which verifying that the source is a $k$-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models. In past work on this and also the more general mixture-of-products problem ($X_i$ independent conditional on $U$, but not necessarily iid), a barrier at $m^{O(k^2)}$ on the sample complexity and/or runtime of the algorithm was reached. We improve this substantially. We show it suffices to use a sample of size $\exp(k\log k)$ (with $m=2k$). It is known that the sample complexity of any solution to the identification problem must be $\exp(\Omega(k))$. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy $\exp(-k\log k)$. Our run-time for the moment problem is only $O(k^{2+o(1)})$ arithmetic operations.
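To make the moment problem concrete, the sketch below recovers a $k$-point distribution on $[0,1]$ from exact moments $\mu_j=\sum_i w_i x_i^j$ (equivalently, $\mu_j=\Pr[X_1=\dots=X_j=1]$ when $x_i$ is the success probability given $U=i$ and $w_i=\Pr[U=i]$). It uses the classical Prony / matrix-pencil approach, which is a stand-in chosen for illustration and is not claimed to be the algorithm of the paper. The function names `moments` and `recover_from_moments`, the indexing convention $\mu_0,\dots,\mu_{2k-1}$, and the example weights and points are all assumptions of this sketch.

```python
import numpy as np


def moments(weights, points, count):
    """Return mu_0..mu_{count-1}, where mu_j = sum_i weights[i] * points[i]**j."""
    weights = np.asarray(weights, dtype=float)
    points = np.asarray(points, dtype=float)
    # powers[j, i] = points[i] ** j
    powers = points[None, :] ** np.arange(count)[:, None]
    return powers @ weights


def recover_from_moments(mu, k):
    """Prony / matrix-pencil sketch (illustrative, not the paper's method).

    Takes moments mu_0..mu_{2k-1} of a distribution with exactly k distinct
    support points and returns (points, weights), with points sorted.
    """
    mu = np.asarray(mu, dtype=float)
    if mu.size < 2 * k:
        raise ValueError("need at least 2k moments mu_0..mu_{2k-1}")
    # Hankel matrices H0[i, j] = mu_{i+j} and H1[i, j] = mu_{i+j+1}.
    H0 = np.array([[mu[i + j] for j in range(k)] for i in range(k)])
    H1 = np.array([[mu[i + j + 1] for j in range(k)] for i in range(k)])
    # With V[i, l] = x_l**i: H0 = V diag(w) V^T and H1 = V diag(w) diag(x) V^T,
    # so the eigenvalues of H0^{-1} H1 are the support points x_l.
    points = np.sort(np.linalg.eigvals(np.linalg.solve(H0, H1)).real)
    # Weights solve the Vandermonde system V w = (mu_0, ..., mu_{k-1}).
    V = points[None, :] ** np.arange(k)[:, None]
    weights = np.linalg.solve(V, mu[:k])
    return points, weights


# Hypothetical example with k = 3.
true_points = np.array([0.1, 0.5, 0.9])
true_weights = np.array([0.2, 0.5, 0.3])
mu = moments(true_weights, true_points, 6)
est_points, est_weights = recover_from_moments(mu, 3)
```

With exact moments this recovers the points and weights up to floating-point error. The Hankel matrices become badly conditioned as $k$ grows or as support points approach each other, which is consistent with the abstract's statement that moments must be known to very high additive accuracy.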
