The Sparse Hausdorff Moment Problem, with Application to Topic Models

16 July 2020

Abstract

We consider the problem of identifying, from its first $m$ noisy moments, a probability distribution on $[0,1]$ of support $k<\infty$. This is equivalent to the problem of learning a distribution on $m$ observable binary random variables $X_1,X_2,\dots,X_m$ that are iid conditional on a hidden random variable $U$ taking values in $\{1,2,\dots,k\}$. Our focus is on accomplishing this with $m=2k$, which is the minimum $m$ for which verifying that the source is a $k$-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models. In past work on this and also the more general mixture-of-products problem ($X_i$ independent conditional on $U$, but not necessarily iid), a barrier at $m^{O(k^2)}$ on the sample complexity and/or runtime of the algorithm was reached. We improve this substantially. We show it suffices to use a sample of size $\exp(k\log k)$ (with $m=2k$). It is known that the sample complexity of any solution to the identification problem must be $\exp(\Omega(k))$. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy $\exp(-k\log k)$. Our run-time for the moment problem is only $O(k^{2+o(1)})$ arithmetic operations.
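To make the moment problem concrete, here is a minimal sketch of recovering a $k$-atom distribution on $[0,1]$ from exact moments using the classical Prony (Hankel-matrix) method. This is not the algorithm of the paper, only a textbook illustration of why about $2k$ moments determine $k$ atoms and $k$ weights. The function names `moments` and `prony_recover` are hypothetical, and the sketch assumes noiseless moments $\mu_j=\sum_i w_i x_i^j$, distinct atoms, and small $k$, since the linear systems involved become badly conditioned as $k$ grows.

```python
import numpy as np


def moments(atoms, weights, m):
    """Return mu_0..mu_m, where mu_j = sum_i weights[i] * atoms[i]**j."""
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.array([np.sum(weights * atoms**j) for j in range(m + 1)])


def prony_recover(mu, k):
    """Recover k atoms and weights from exact moments mu_0..mu_{2k-1}.

    Illustrative only: assumes noiseless moments and k distinct atoms.
    """
    mu = np.asarray(mu, dtype=float)
    # Hankel matrix H[i, j] = mu_{i+j} for 0 <= i, j < k.
    H = np.array([[mu[i + j] for j in range(k)] for i in range(k)])
    # The monic polynomial p(x) = x^k + c[k-1] x^(k-1) + ... + c[0] that
    # vanishes on the atoms satisfies H c = -(mu_k, ..., mu_{2k-1}).
    c = np.linalg.solve(H, -mu[k:2 * k])
    # np.roots expects coefficients from highest degree to lowest.
    atoms = np.sort(np.roots(np.concatenate(([1.0], c[::-1]))).real)
    # Weights solve the Vandermonde system sum_i w_i * atoms[i]**j = mu_j.
    V = np.vander(atoms, N=k, increasing=True).T
    weights = np.linalg.solve(V, mu[:k])
    return atoms, weights


if __name__ == "__main__":
    k = 3
    mu = moments([0.2, 0.5, 0.9], [0.3, 0.5, 0.2], 2 * k)
    atoms, weights = prony_recover(mu, k)
    print("atoms:  ", atoms)
    print("weights:", weights)
```

In the latent-variable formulation, $\mu_j$ is the probability that $j$ of the observed binary variables all equal 1, so in practice each moment would be estimated from samples; the sketch deliberately skips that step and the noise analysis that is the subject of the paper.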
