
Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples

Abstract

We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that $\mathrm{poly}(k, d, 1/\alpha, 1/\varepsilon, \log(1/\delta))$ samples are sufficient to estimate a mixture of $k$ Gaussians in $\mathbb{R}^d$ up to total variation distance $\alpha$ while satisfying $(\varepsilon, \delta)$-DP. This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs. To solve the problem, we devise a new framework which may be useful for other tasks. At a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a "locally small" cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).
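For concreteness, the main guarantee described above can be restated as the following informal sample complexity bound. This is a paraphrase of the abstract's claim only; the exact polynomial exponents and the success probability are not specified here and are treated as assumptions.

```latex
\documentclass{article}
\usepackage{amsmath, amssymb, amsthm}
\newtheorem*{theorem}{Theorem}
\begin{document}
% Informal restatement of the abstract's main result. The precise polynomial
% dependence and the failure probability are left unspecified (assumptions).
\begin{theorem}[informal]
For every $\alpha, \varepsilon, \delta \in (0,1)$, there is an
$(\varepsilon,\delta)$-differentially private algorithm that, given
\[
  n = \mathrm{poly}\!\left(k,\; d,\; 1/\alpha,\; 1/\varepsilon,\; \log(1/\delta)\right)
\]
i.i.d.\ samples from an unknown mixture $\mathcal{D}$ of $k$ Gaussians in
$\mathbb{R}^d$, outputs a distribution $\widehat{\mathcal{D}}$ satisfying
$d_{\mathrm{TV}}\bigl(\widehat{\mathcal{D}}, \mathcal{D}\bigr) \le \alpha$
with high probability.
\end{theorem}
\end{document}
```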
