
Sparse Mean Estimation in Adversarial Settings via Incremental Learning

Main: 17 pages
Appendix: 6 pages
Bibliography: 5 pages
9 figures, 1 table
Abstract

In this paper, we study the problem of sparse mean estimation under adversarial corruptions, where the goal is to estimate the k-sparse mean of a heavy-tailed distribution from samples contaminated by adversarial noise. Existing methods face two key limitations: they require prior knowledge of the sparsity level k and scale poorly to high-dimensional settings. We propose a simple and scalable estimator that addresses both challenges. Specifically, it learns the k-sparse mean without knowing k in advance and operates in near-linear time and memory with respect to the ambient dimension. Under a moderate signal-to-noise ratio, our method achieves the optimal statistical rate, matching the information-theoretic lower bound. Extensive simulations corroborate our theoretical guarantees. At the heart of our approach is an incremental learning phenomenon: we show that a basic subgradient method applied to a nonconvex two-layer formulation with an ℓ₁-loss can incrementally learn the k nonzero components of the true mean while suppressing the rest. More broadly, our work is the first to reveal the incremental learning phenomenon of the subgradient method in the presence of heavy-tailed distributions and adversarial corruption.
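To make the abstract's core idea concrete, the following is a minimal sketch, not the authors' exact algorithm: a plain subgradient method on a two-layer (Hadamard-product) parameterization θ = u ⊙ v with the ℓ₁-loss L(u, v) = (1/n) Σᵢ ‖xᵢ − u ⊙ v‖₁, initialized near zero. All function names, step sizes, and data-generation choices below are illustrative assumptions; with a small initialization, the signal coordinates grow quickly toward the coordinate-wise median while the zero coordinates stay suppressed, mimicking the incremental learning phenomenon described above.

```python
import numpy as np

def sparse_mean_subgradient(X, step=0.01, iters=2000, init_scale=1e-3):
    """Hypothetical sketch: subgradient descent on the l1-loss under a
    two-layer Hadamard parameterization theta = u * v, initialized small
    so that nonzero mean coordinates are learned incrementally."""
    n, d = X.shape
    u = init_scale * np.ones(d)
    v = init_scale * np.ones(d)
    for _ in range(iters):
        theta = u * v
        # Subgradient of (1/n) sum_i ||x_i - theta||_1 w.r.t. theta.
        g = np.mean(np.sign(theta[None, :] - X), axis=0)
        # Chain rule through theta = u * v (simultaneous update).
        u, v = u - step * g * v, v - step * g * u
    return u * v

# Illustrative usage: heavy-tailed (Student-t) samples around a 3-sparse
# mean, with 5% of the rows adversarially shifted.
d, n = 50, 400
mu = np.zeros(d)
mu[:3] = 5.0
rng = np.random.default_rng(1)
X = mu + rng.standard_t(df=3, size=(n, d))
X[:20] += 100.0  # corrupt 5% of the samples
est = sparse_mean_subgradient(X)
```

Note that the sketch never uses the sparsity level k: the small initialization alone separates the growth rates of signal and noise coordinates, which is the sense in which k need not be known in advance.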
