Efficient Unbiased Sparsification

An unbiased $m$-sparsification of a vector $p \in \mathbb{R}^n$ is a random vector $Q \in \mathbb{R}^n$ with mean $p$ that has at most $m < n$ nonzero coordinates. Unbiased sparsification compresses the original vector without introducing bias; it arises in various contexts, such as in federated learning and sampling sparse probability distributions. Ideally, unbiased sparsification should also minimize the expected value of a divergence function $\mathrm{Div}(Q, p)$ that measures how far away $Q$ is from the original $p$. If $Q$ is optimal in this sense, then we call it efficient. Our main results describe efficient unbiased sparsifications for divergences that are either permutation-invariant or additively separable. Surprisingly, the characterization for permutation-invariant divergences is robust to the choice of divergence function, in the sense that our class of optimal $Q$ for squared Euclidean distance coincides with our class of optimal $Q$ for Kullback-Leibler divergence, or indeed any of a wide variety of divergences.
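To make the notion concrete, here is a minimal sketch of one simple (not necessarily efficient) unbiased $m$-sparsification: keep $m$ coordinates chosen uniformly at random and rescale them by $n/m$, so that each coordinate's expectation is preserved. The function name `uniform_unbiased_sparsify` and the uniform-subset scheme are illustrative assumptions, not the construction analyzed in the paper.

```python
import numpy as np

def uniform_unbiased_sparsify(p, m, rng=None):
    """Return a random vector Q with E[Q] = p and at most m nonzero coordinates.

    Each coordinate is retained with probability m/n and rescaled by n/m,
    so E[Q_i] = (m/n) * (n/m) * p_i = p_i (unbiasedness).
    """
    rng = np.random.default_rng(rng)
    p = np.asarray(p, dtype=float)
    n = p.size
    q = np.zeros_like(p)
    keep = rng.choice(n, size=m, replace=False)  # uniform subset of m indices
    q[keep] = p[keep] * (n / m)                  # inverse-probability scaling
    return q

# Empirical check of unbiasedness: the sample mean of Q approaches p.
p = np.array([0.4, 0.3, 0.2, 0.1])
avg = np.mean([uniform_unbiased_sparsify(p, m=2, rng=i) for i in range(20000)], axis=0)
print(avg)  # close to [0.4, 0.3, 0.2, 0.1]
```

An efficient sparsification would additionally minimize the expected divergence $\mathbb{E}[\mathrm{Div}(Q, p)]$ over all unbiased $m$-sparsifications, which the uniform scheme above generally does not.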