Near-optimal algorithms for private estimation and sequential testing of collision probability

Abstract
We present new algorithms for estimating and testing collision probability, a fundamental measure of the spread of a discrete distribution that is widely used in many scientific fields. We describe an algorithm that satisfies local differential privacy and estimates collision probability to within a prescribed error, with a sample complexity that improves over previous work. We also present a sequential testing algorithm for collision probability, which can distinguish between collision probability values that are separated by a given gap, even when the size of that gap is unknown. Our algorithms have nearly optimal sample complexity, and in experiments we show that they require significantly fewer samples than previous methods.
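For readers unfamiliar with the quantity itself: the collision probability of a discrete distribution p is the sum of p_i^2 over all outcomes i, which is the chance that two independent draws from p coincide. The sketch below illustrates only this definition and the standard non-private pair-counting estimate. It is not the private or sequential algorithm of the paper, and the function names are hypothetical, chosen for illustration.

```python
from collections import Counter
from fractions import Fraction


def collision_probability(probs):
    """Exact collision probability sum_i p_i^2 of a known distribution."""
    return sum(p * p for p in probs)


def estimate_collision_probability(samples):
    """Estimate collision probability as the fraction of colliding sample pairs.

    Counts ordered pairs of distinct positions holding equal values and
    divides by the total number of ordered pairs, n * (n - 1). This is
    the classical non-private estimator, shown here as an assumption-free
    baseline and not as the method proposed in the paper.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) for c in counts.values())
    return Fraction(colliding_pairs, n * (n - 1))
```

A uniform distribution over k outcomes has collision probability 1/k, the smallest possible for that support size, while a point mass has collision probability 1. This is why the quantity serves as a measure of spread.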
Robert Busa-Fekete and Umar Syed. Near-optimal algorithms for private estimation and sequential testing of collision probability. arXiv preprint arXiv:2504.13804, 2025.