
The Sparse Hausdorff Moment Problem, with Application to Topic Models

Abstract

We consider the problem of identifying, from its first m noisy moments, a probability distribution on [0,1] of support size k < ∞. This is equivalent to the problem of learning a distribution on m observable binary random variables X_1, X_2, ..., X_m that are iid conditional on a hidden random variable U taking values in {1, 2, ..., k}. Our focus is on accomplishing this with m = 2k, which is the minimum m for which verifying that the source is a k-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models. We give an algorithm for identifying a k-mixture using samples of m = 2k iid binary random variables using a sample of size (1/w_min)^2 · (1/ζ)^{O(k)} and post-sampling runtime of only O(k^{2+o(1)}) arithmetic operations. Here w_min is the minimum probability of an outcome of U, and ζ is the minimum separation between the distinct success probabilities of the X_i's. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy w_min · ζ^{O(k)}. It is known that the sample complexity of any solution to the identification problem must be at least exponential in k. Previous results demonstrated either worse sample complexity and worse O(k^c) runtime for some c substantially larger than 2, or similar sample complexity and much worse k^{O(k^2)} runtime.
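To make the noiseless version of the moment problem concrete, here is a minimal sketch of recovering a k-spike distribution on [0,1] from its first 2k exact moments via the classical Prony's method. This is only an illustration of the problem setup with assumed toy values (spikes 0.3, 0.8 with weights 0.4, 0.6); it is not the paper's algorithm and is not robust to the moment noise the paper handles.

```python
import numpy as np

# Toy k-spike mixture (assumed values for this demo, not from the paper):
# spikes p_i with weights w_i, k = 2.
k = 2
p_true = np.array([0.3, 0.8])
w_true = np.array([0.4, 0.6])

# The first 2k exact moments m_j = sum_i w_i * p_i^j, for j = 0, ..., 2k-1.
m = np.array([w_true @ p_true**j for j in range(2 * k)])

# Prony's method: the moments satisfy a linear recurrence whose
# characteristic polynomial x^k - c_{k-1} x^{k-1} - ... - c_0 has the
# spikes as roots. Solve a k x k Hankel system for the coefficients c.
H = np.array([[m[i + j] for j in range(k)] for i in range(k)])
c = np.linalg.solve(H, m[k:2 * k])

# Recover the spikes as roots of the characteristic polynomial,
# then the weights from a Vandermonde system in the low-order moments.
p_hat = np.sort(np.roots(np.concatenate(([1.0], -c[::-1]))))
w_hat = np.linalg.solve(np.vander(p_hat, k, increasing=True).T, m[:k])

print(p_hat)  # -> approximately [0.3, 0.8]
print(w_hat)  # -> approximately [0.4, 0.6]
```

With exact moments this recovers the mixture exactly (up to floating point); the paper's contribution is achieving such recovery stably from noisy moments, with the stated sample complexity and near-optimal O(k^{2+o(1)}) post-sampling runtime.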
