
Efficient Unbiased Sparsification

Abstract

An unbiased $m$-sparsification of a vector $p\in \mathbb{R}^n$ is a random vector $Q\in \mathbb{R}^n$ with mean $p$ that has at most $m<n$ nonzero coordinates. Unbiased sparsification compresses the original vector without introducing bias; it arises in various contexts, such as in federated learning and sampling sparse probability distributions. Ideally, unbiased sparsification should also minimize the expected value of a divergence function $\mathsf{Div}(Q,p)$ that measures how far away $Q$ is from the original $p$. If $Q$ is optimal in this sense, then we call it efficient. Our main results describe efficient unbiased sparsifications for divergences that are either permutation-invariant or additively separable. Surprisingly, the characterization for permutation-invariant divergences is robust to the choice of divergence function, in the sense that our class of optimal $Q$ for squared Euclidean distance coincides with our class of optimal $Q$ for Kullback-Leibler divergence, or indeed any of a wide variety of divergences.
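To make the definition concrete, the following sketch implements the classic rand-$m$ baseline sparsifier: keep $m$ coordinates chosen uniformly at random and rescale the survivors by $n/m$, which gives $\mathbb{E}[Q]=p$ with at most $m$ nonzeros. This is a standard baseline for illustration only, not the efficient construction characterized in the paper; the function name and parameters are our own.

```python
import numpy as np

def rand_m_sparsify(p, m, rng=None):
    """Unbiased rand-m sparsifier (baseline, not the paper's construction).

    Keeps m coordinates of p chosen uniformly at random and rescales
    them by n/m, so each coordinate i satisfies
    E[Q_i] = (m/n) * (n/m) * p_i = p_i.
    """
    rng = np.random.default_rng(rng)
    p = np.asarray(p, dtype=float)
    n = p.size
    Q = np.zeros(n)
    idx = rng.choice(n, size=m, replace=False)  # uniform subset of size m
    Q[idx] = (n / m) * p[idx]                   # rescale kept entries
    return Q

# Empirical check of unbiasedness: the sample mean over many draws
# should approach p, while every single draw has at most m nonzeros.
p = np.array([0.5, 0.3, 0.15, 0.05])
samples = [rand_m_sparsify(p, m=2, rng=s) for s in range(20000)]
avg = np.mean(samples, axis=0)
```

Under squared Euclidean distance, the expected divergence of this baseline is $\mathbb{E}\|Q-p\|^2 = (n/m - 1)\|p\|^2$, which an efficient sparsifier in the paper's sense can improve upon by choosing coordinates non-uniformly.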
