Robust Sparse Mean Estimation via Incremental Learning

24 May 2023

Jianhao Ma


Main: 17 pages · Bibliography: 5 pages · Appendix: 6 pages · 9 figures · 1 table

Abstract

In this paper, we study the problem of robust sparse mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of partially corrupted samples drawn from a heavy-tailed distribution. Existing estimators face two critical challenges in this setting. First, they rely on prior knowledge of the sparsity level $k$. Second, they fall short of practical use because they scale poorly with the ambient dimension. This paper presents a simple mean estimator that overcomes both challenges under moderate conditions: it works without knowledge of $k$ and runs in near-linear time and memory (both with respect to the ambient dimension). Moreover, provided that the signal-to-noise ratio is large, we can further improve our result to match the information-theoretic lower bound. At the core of our method lies an incremental learning phenomenon: we introduce a simple nonconvex framework that can incrementally learn the top-$k$ nonzero elements of the mean while keeping the zero elements arbitrarily small. Finally, we conduct a series of simulations to corroborate our theoretical findings.
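
The incremental learning behaviour described in the abstract can be illustrated with a toy sketch. The abstract does not state the parameterization, loss, or step sizes, so everything below is an assumption made for illustration and not necessarily the paper's algorithm: the estimate is written as $\theta = u \odot u - v \odot v$ with a tiny initialization `alpha`, and plain subgradient descent is run on the $\ell_1$ loss $\frac{1}{n}\sum_i \|x_i - \theta\|_1$ with early stopping. The function names, the Student-t noise, the outlier value, and all constants are hypothetical.

```python
import math
import random


def sample_t3(rng):
    """Draw one Student-t sample with 3 degrees of freedom (heavy-tailed)."""
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / 3.0)


def make_data(mu, n, eps, rng):
    """n samples around mu with t3 noise; the first eps*n rows are outliers."""
    d = len(mu)
    data = [[mu[j] + sample_t3(rng) for j in range(d)] for _ in range(n)]
    for i in range(int(eps * n)):
        data[i] = [50.0] * d  # hypothetical gross corruption
    return data


def incremental_mean(data, alpha=1e-6, eta=0.05, steps=300):
    """Subgradient descent on the l1 loss under theta = u*u - v*v (assumed)."""
    n, d = len(data), len(data[0])
    u = [alpha] * d
    v = [alpha] * d
    for _ in range(steps):
        for j in range(d):
            theta = u[j] * u[j] - v[j] * v[j]
            # Subgradient of (1/n) * sum_i |x_ij - theta| with respect to theta.
            g = sum((theta > row[j]) - (theta < row[j]) for row in data) / n
            # Chain rule: d/du = 2*u*g and d/dv = -2*v*g.
            u[j] -= eta * 2.0 * u[j] * g
            v[j] += eta * 2.0 * v[j] * g
    return [u[j] * u[j] - v[j] * v[j] for j in range(d)]


rng = random.Random(0)
mu = [5.0, -4.0, 3.0] + [0.0] * 27  # 3-sparse mean in 30 dimensions
data = make_data(mu, n=200, eps=0.1, rng=rng)
est = incremental_mean(data)  # note: the sparsity level is never passed in
support = [j for j, t in enumerate(est) if abs(t) > 0.5]
```

In this sketch each coordinate is multiplied by a factor close to $1 \pm 2\eta$ per step, so coordinates with a strong signal grow exponentially from `alpha` and are picked up first, while coordinates whose true mean is zero grow far more slowly and remain tiny if the iteration is stopped early. The 0.5 threshold used to read off the support is likewise an illustrative choice.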
