$\ell_p$ Testing and Learning of Discrete Distributions

7 December 2014
Bo Waggoner
arXiv: 1412.2314
Abstract

The classic problems of testing uniformity of and learning a discrete distribution, given access to independent samples from it, are examined under general $\ell_p$ metrics. The intuitions and results often contrast with the classic $\ell_1$ case. For $p > 1$, we can learn and test with a number of samples that is independent of the support size of the distribution: with an $\ell_p$ tolerance $\epsilon$, $O(\max\{\sqrt{1/\epsilon^q},\, 1/\epsilon^2\})$ samples suffice for testing uniformity and $O(\max\{1/\epsilon^q,\, 1/\epsilon^2\})$ samples suffice for learning, where $q = p/(p-1)$ is the conjugate of $p$. As this parallels the intuition that $O(\sqrt{n})$ and $O(n)$ samples suffice for the $\ell_1$ case, it seems that $1/\epsilon^q$ acts as an upper bound on the "apparent" support size. For some $\ell_p$ metrics, uniformity testing becomes easier over larger supports: a 6-sided die requires fewer trials to test for fairness than a 2-sided coin, and a card-shuffler requires fewer trials than the die. In fact, this inverse dependence on support size holds if and only if $p > \frac{4}{3}$. The uniformity testing algorithm simply thresholds the number of "collisions" or "coincidences" and has an optimal sample complexity up to constant factors for all $1 \leq p \leq 2$. Another algorithm gives order-optimal sample complexity for $\ell_{\infty}$ uniformity testing. Meanwhile, the most natural learning algorithm is shown to have order-optimal sample complexity for all $\ell_p$ metrics. The author thanks Clément Canonne for discussions and contributions to this work.
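
To make the collision-thresholding idea concrete, here is a minimal Python sketch for the $\ell_2$ case. It is an illustration of the general technique, not the paper's exact algorithm: the midpoint threshold and the sample sizes in the demo are assumptions chosen for readability, while the underlying fact it uses (the expected fraction of colliding pairs equals $\|p\|_2^2$, which is $1/n$ for the uniform distribution and at least $1/n + \epsilon^2$ when $\|p - u\|_2 \geq \epsilon$) is standard.

```python
import random
from collections import Counter


def collision_fraction(samples):
    """Fraction of sample pairs (i < j) that collide, i.e. samples[i] == samples[j]."""
    counts = Counter(samples)
    pairs = len(samples) * (len(samples) - 1) // 2
    return sum(c * (c - 1) // 2 for c in counts.values()) / pairs


def test_uniformity_l2(samples, support_size, eps):
    """Return True if the samples look uniform on `support_size` elements,
    up to l2 tolerance `eps`.

    The collision fraction estimates ||p||_2^2: roughly 1/n under the uniform
    distribution, and at least 1/n + eps^2 when ||p - uniform||_2 >= eps.
    We threshold at the midpoint; the constants are illustrative, not the
    optimized ones from the paper.
    """
    return collision_fraction(samples) <= 1.0 / support_size + eps ** 2 / 2.0


if __name__ == "__main__":
    n, eps, m = 6, 0.25, 2000
    fair_die = [random.randrange(n) for _ in range(m)]
    loaded_die = [0 if random.random() < 0.5 else random.randrange(n)
                  for _ in range(m)]
    print(test_uniformity_l2(fair_die, n, eps))    # expected: True
    print(test_uniformity_l2(loaded_die, n, eps))  # expected: False
```

Note that the threshold and the number of samples needed depend on $n$ only through the $1/n$ baseline, which matches the abstract's point that for $p > 1$ the sample complexity can be made independent of the support size.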
