arXiv:2410.23087

Statistical-Computational Trade-offs for Density Estimation

30 October 2024
Anders Aamand
Alexandr Andoni
Justin Y. Chen
Piotr Indyk
Shyam Narayanan
Sandeep Silwal
Haike Xu
Abstract

We study the density estimation problem defined as follows: given $k$ distributions $p_1, \ldots, p_k$ over a discrete domain $[n]$, as well as a collection of samples chosen from a "query" distribution $q$ over $[n]$, output $p_i$ that is "close" to $q$. Recently, \cite{aamand2023data} gave the first and only known result that achieves sublinear bounds in both the sampling complexity and the query time while preserving polynomial data structure space. However, their improvement over linear samples and time is only by subpolynomial factors. Our main result is a lower bound showing that, for a broad class of data structures, their bounds cannot be significantly improved. In particular, if an algorithm uses $O(n/\log^c k)$ samples for some constant $c>0$ and polynomial space, then the query time of the data structure must be at least $k^{1-O(1)/\log \log k}$, i.e., close to linear in the number of distributions $k$. This is a novel statistical-computational trade-off for density estimation, demonstrating that any data structure must use close to a linear number of samples or take close to linear query time. The lower bound holds even in the realizable case where $q=p_i$ for some $i$, and when the distributions are flat (specifically, all distributions are uniform over half of the domain $[n]$). We also give a simple data structure for our lower bound instance with asymptotically matching upper bounds. Experiments show that the data structure is quite efficient in practice.
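
To make the problem setup concrete, here is a minimal Python sketch of the flat, realizable instance described in the abstract: each $p_i$ is uniform over a subset of half of $[n]$, and the query samples come from some $p_i$. It implements only a naive linear-scan baseline that checks every distribution, not the paper's data structure. The function names (`make_instance`, `sample_from`, `consistent_candidates`) and the choice of random supports are illustrative assumptions, not taken from the paper.

```python
import random


def make_instance(n, k, seed=0):
    """Hypothetical instance generator: k random supports, each of size n // 2.

    Distribution p_i is taken to be uniform over supports[i].
    """
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(n), n // 2)) for _ in range(k)]


def sample_from(support, num_samples, rng):
    """Draw i.i.d. samples from the uniform distribution over `support`."""
    items = sorted(support)
    return [rng.choice(items) for _ in range(num_samples)]


def consistent_candidates(supports, samples):
    """Naive baseline: scan all k supports and keep those containing every sample.

    In the realizable case (q = p_i), the true index i always survives,
    because a sample from p_i can never fall outside its support.
    """
    observed = set(samples)
    return [i for i, supp in enumerate(supports) if observed <= supp]


# Small demonstration with assumed parameters.
supports = make_instance(n=64, k=20, seed=1)
true_index = 7
samples = sample_from(supports[true_index], num_samples=40, rng=random.Random(2))
candidates = consistent_candidates(supports, samples)
print(true_index in candidates, len(candidates))
```

This baseline touches all $k$ distributions on every query, so its query time is linear in $k$. The lower bound stated in the abstract says that, for a broad class of data structures using $O(n/\log^c k)$ samples and polynomial space, query time cannot be pushed much below that.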
