We consider the problem of identifying, from its first $m$ noisy moments, a probability distribution on $[0,1]$ of support size $k$. This is equivalent to the problem of learning a distribution on $m$ observable binary random variables $X_1, \ldots, X_m$ that are iid conditional on a hidden random variable $U$ taking values in $\{1, \ldots, k\}$. Our focus is on accomplishing this with $m = 2k$, which is the minimum $m$ for which verifying that the source is a $k$-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models.

In past work on this and also on the more general mixture-of-products problem ($X_1, \ldots, X_m$ independent conditional on $U$, but not necessarily iid), a barrier at sample complexity and/or runtime exponential in $k^2$ was reached. We improve this substantially: we show that a sample of size exponential in $k$ suffices (with $m = 2k$). It is known that the sample complexity of any solution to the identification problem must itself be exponential in $k$, so this dependence is essentially optimal. Stated in terms of the moment problem, it suffices to know the moments to an additive accuracy that is only exponentially small in $k$ (rather than in $k^2$). Our run-time for the moment problem is only $O(k^{2+o(1)})$ arithmetic operations.
View on arXiv
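To make the identification task concrete: in the idealized exact-moment setting (no noise), recovering a $k$-atom distribution on $[0,1]$ from its first $2k$ moments is the classical sparse moment problem, solvable by Prony's method — solve a Hankel system for the annihilating polynomial of the atoms, root-find, then solve a Vandermonde system for the weights. The sketch below is our illustration of that classical exact-moment route, not the paper's noise-tolerant algorithm; the function name `prony_recover` and the example mixture are ours.

```python
import numpy as np

def prony_recover(moments, k):
    """Recover atoms p_i and weights w_i of a k-atom distribution on [0,1]
    from its first 2k exact moments: moments[j] = sum_i w_i * p_i**j."""
    m = np.asarray(moments, dtype=float)
    # Hankel system: the monic polynomial with the atoms as roots
    # annihilates the moment sequence.
    H = np.array([[m[r + s] for s in range(k)] for r in range(k)])
    b = m[k:2 * k]
    c = np.linalg.solve(H, -b)  # low-order coefficients c[0..k-1]
    # Roots of x^k + c[k-1] x^{k-1} + ... + c[0] are the atoms.
    atoms = np.roots(np.concatenate(([1.0], c[::-1])))
    atoms = np.sort(atoms.real)
    # Vandermonde system V[j, i] = atoms[i]**j recovers the weights
    # from the first k moments.
    V = np.vander(atoms, k, increasing=True).T
    weights = np.linalg.solve(V, m[:k])
    return atoms, weights

# Example (ours): the mixture 0.3*delta_{0.2} + 0.7*delta_{0.8}, so k = 2
# and m = 2k = 4 moments determine it.
p, w = np.array([0.2, 0.8]), np.array([0.3, 0.7])
mom = [float(w @ p**j) for j in range(4)]
atoms, weights = prony_recover(mom, 2)
```

The catch, and the point of the paper's quantitative bounds, is that this exact-moment route is numerically fragile: the Hankel and Vandermonde systems become severely ill-conditioned as atoms approach one another, so with noisy moments the required moment accuracy and sample size are what must be controlled.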