Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity

25 September 2023
Spencer Gordon
Erik Jahn
Bijan Mazaheri
Y. Rabani
Leonard J. Schulman
arXiv:2309.13993
Abstract

We consider the problem of identifying, from statistics, a distribution of discrete random variables $X_1, \ldots, X_n$ that is a mixture of $k$ product distributions. The best previous sample complexity for $n \in O(k)$ was $(1/\zeta)^{O(k^2 \log k)}$ (under a mild separation assumption parameterized by $\zeta$). The best known lower bound was $\exp(\Omega(k))$. It is known that $n \geq 2k-1$ is necessary and sufficient for identification. We show, for any $n \geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$. We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$. Our results are obtained by combining (a) a classic method for robust tensor decomposition with (b) a novel way of bounding the condition number of key matrices called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
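For intuition, here is a minimal sketch of the generative model the abstract refers to, written in Python with NumPy; the component count, variable count, and marginals below are invented for illustration and are not taken from the paper. Each sample first draws one of the $k$ hidden components according to the mixing weights, then draws the $n$ discrete coordinates independently from that component's per-coordinate marginals (binary coordinates here, for brevity).

import numpy as np

# Hypothetical toy instance: k mixture components over n binary variables.
# Conditioned on the hidden component, the coordinates are mutually independent.
rng = np.random.default_rng(0)
k, n = 3, 2 * 3 - 1                              # n >= 2k - 1, the identifiability threshold cited above

weights = rng.dirichlet(np.ones(k))              # mixing weights pi_1, ..., pi_k
marginals = rng.uniform(0.1, 0.9, size=(k, n))   # P(X_i = 1 | component j)

def sample(num_samples: int) -> np.ndarray:
    """Draw i.i.d. samples from the mixture of product distributions."""
    components = rng.choice(k, size=num_samples, p=weights)    # latent component per sample
    coin_flips = rng.uniform(size=(num_samples, n))
    return (coin_flips < marginals[components]).astype(int)    # independent coordinates given component

data = sample(10_000)
print(data.mean(axis=0))   # empirical marginals: a mixture-weighted average of the rows of `marginals`

The identification task is the converse direction: from samples like `data` alone, recover the mixing weights and the per-component marginals (up to relabeling of the components). The paper shows this can be done with sample and run-time complexity $(1/\zeta)^{O(k)}$ whenever $n \geq 2k-1$, under its $\zeta$-parameterized separation assumption.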
