
Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model

Main:20 Pages
Bibliography:4 Pages
Appendix:17 Pages
Abstract

The turnstile continual release model of differential privacy captures scenarios where a privacy-preserving real-time analysis is sought for a dataset evolving through additions and deletions. In typical applications of real-time data analysis, both the length of the stream $T$ and the size of the universe $|U|$ from which data come can be extremely large. This motivates the study of private algorithms in the turnstile setting using space sublinear in both $T$ and $|U|$. In this paper, we give the first sublinear space differentially private algorithms for the fundamental problem of counting distinct elements in the turnstile streaming model. Our algorithm achieves, on arbitrary streams, $\tilde{O}_{\eta}(T^{1/3})$ space and additive error, and a $(1+\eta)$-relative approximation for all $\eta \in (0,1)$. Our result significantly improves upon the space requirements of the state-of-the-art algorithms for this problem, which are linear, while approaching the known $\Omega(T^{1/4})$ additive error lower bound for arbitrary streams. Moreover, when a bound $W$ on the number of times an item appears in the stream is known, our algorithm provides $\tilde{O}_{\eta}(\sqrt{W})$ additive error, using $\tilde{O}_{\eta}(\sqrt{W})$ space. This additive error asymptotically matches that of prior work, which instead required linear space. Our results address an open question posed by [Jain, Kalemaj, Raskhodnikova, Sivakumar, Smith, NeurIPS 2023] about designing low-memory mechanisms for this problem. We complement these results with a space lower bound for this problem, which shows that any algorithm that uses similar techniques must use space $\tilde{\Omega}(T^{1/3})$ on arbitrary streams.
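To make the problem statement concrete, the following is a minimal sketch of the exact, non-private baseline: it reports the number of distinct elements after every update of a turnstile stream. It is not the algorithm of the paper; it stores one counter per item seen (so its space is linear, which is exactly what the paper aims to avoid) and adds no noise. The function name `distinct_counts`, the `(item, delta)` update encoding, and the convention that an item is present when its running count is positive are assumptions made for illustration.

```python
from collections import Counter


def distinct_counts(stream):
    """Return the exact distinct-element count after each update.

    `stream` is an iterable of (item, delta) pairs, where delta is +1 for
    an insertion, -1 for a deletion, and 0 for a no-op (assumed encoding).
    An item is treated as present while its running count is positive.
    """
    counts = Counter()   # one counter per item seen: linear space, no privacy
    distinct = 0
    released = []
    for item, delta in stream:
        before = counts[item]
        counts[item] += delta
        after = counts[item]
        if before <= 0 < after:      # item just became present
            distinct += 1
        elif after <= 0 < before:    # item just became absent
            distinct -= 1
        released.append(distinct)    # continual release: one output per step
    return released


if __name__ == "__main__":
    updates = [("a", 1), ("b", 1), ("a", -1), ("a", 1), ("b", -1)]
    print(distinct_counts(updates))
```

A private, space-efficient mechanism would have to approximate this sequence of outputs without keeping `counts` in full, which is where the additive error and the $(1+\eta)$ relative approximation in the abstract come from.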
