Near-optimal algorithms for private estimation and sequential testing of collision probability

Abstract

We present new algorithms for estimating and testing collision probability, a fundamental measure of the spread of a discrete distribution that is widely used in many scientific fields. We describe an algorithm that satisfies $(\alpha, \beta)$-local differential privacy and estimates collision probability with error at most $\epsilon$ using $\tilde{O}\left(\frac{\log(1/\beta)}{\alpha^2 \epsilon^2}\right)$ samples for $\alpha \le 1$, which improves over previous work by a factor of $\frac{1}{\alpha^2}$. We also present a sequential testing algorithm for collision probability, which can distinguish between collision probability values that are separated by $\epsilon$ using $\tilde{O}\left(\frac{1}{\epsilon^2}\right)$ samples, even when $\epsilon$ is unknown. Our algorithms have nearly the optimal sample complexity, and in experiments we show that they require significantly fewer samples than previous methods.
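For readers unfamiliar with the quantity being estimated: the collision probability of a discrete distribution $p$ is $\sum_i p_i^2$, the probability that two independent draws coincide. The sketch below is not the private or sequential algorithm described in the abstract; it is the standard non-private pair-counting estimator, shown only to make the quantity concrete. The function names are illustrative choices, not taken from the paper.

```python
from collections import Counter


def collision_probability(probs):
    """Exact collision probability of a distribution given as a list of probabilities."""
    return sum(p * p for p in probs)


def estimate_collision_probability(samples):
    """Non-private estimate: the fraction of ordered sample pairs that collide.

    This is the classical unbiased pair-counting estimator, not the
    locally private estimator from the paper.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to form a pair")
    counts = Counter(samples)
    # A value seen c times contributes c * (c - 1) ordered colliding pairs.
    colliding_pairs = sum(c * (c - 1) for c in counts.values())
    return colliding_pairs / (n * (n - 1))
```

A uniform distribution over $k$ outcomes has collision probability $1/k$, the smallest possible value for support size $k$, while a point mass has collision probability 1.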

@article{busa-fekete2025_2504.13804,
  title={Near-optimal algorithms for private estimation and sequential testing of collision probability},
  author={Robert Busa-Fekete and Umar Syed},
  journal={arXiv preprint arXiv:2504.13804},
  year={2025}
}