
Sparse Mean Estimation in Adversarial Settings via Incremental Learning

Main: 17 pages
Appendix: 6 pages
Bibliography: 5 pages
9 figures, 1 table
Abstract

In this paper, we study the problem of sparse mean estimation under adversarial corruptions, where the goal is to estimate the k-sparse mean of a heavy-tailed distribution from samples contaminated by adversarial noise. Existing methods face two key limitations: they require prior knowledge of the sparsity level k and scale poorly to high-dimensional settings. We propose a simple and scalable estimator that addresses both challenges. Specifically, it learns the k-sparse mean without knowing k in advance and operates in near-linear time and memory with respect to the ambient dimension. Under a moderate signal-to-noise ratio, our method achieves the optimal statistical rate, matching the information-theoretic lower bound. Extensive simulations corroborate our theoretical guarantees. At the heart of our approach is an incremental learning phenomenon: we show that a basic subgradient method applied to a nonconvex two-layer formulation with an ℓ₁-loss can incrementally learn the k nonzero components of the true mean while suppressing the rest. More broadly, our work is the first to reveal the incremental learning phenomenon of the subgradient method in the presence of heavy-tailed distributions and adversarial corruption.
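To make the abstract's core idea concrete, the following is a minimal sketch, not the authors' exact algorithm: a plain subgradient method on a two-layer (Hadamard-product) parameterization θ = u ⊙ v with the ℓ₁-loss L(u, v) = (1/n) Σᵢ ‖xᵢ − u ⊙ v‖₁, initialized near zero. All function names, step sizes, and data-generation choices below are illustrative assumptions; with a small initialization, the signal coordinates grow quickly toward the coordinate-wise median while the zero coordinates stay suppressed, mimicking the incremental learning phenomenon described above.

```python
import numpy as np

def sparse_mean_subgradient(X, step=0.01, iters=2000, init_scale=1e-3):
    """Hypothetical sketch: subgradient descent on the l1-loss under a
    two-layer Hadamard parameterization theta = u * v, initialized small
    so that nonzero mean coordinates are learned incrementally."""
    n, d = X.shape
    u = init_scale * np.ones(d)
    v = init_scale * np.ones(d)
    for _ in range(iters):
        theta = u * v
        # Subgradient of (1/n) sum_i ||x_i - theta||_1 w.r.t. theta.
        g = np.mean(np.sign(theta[None, :] - X), axis=0)
        # Chain rule through theta = u * v (simultaneous update).
        u, v = u - step * g * v, v - step * g * u
    return u * v

# Illustrative usage: heavy-tailed (Student-t) samples around a 3-sparse
# mean, with 5% of the rows adversarially shifted.
d, n = 50, 400
mu = np.zeros(d)
mu[:3] = 5.0
rng = np.random.default_rng(1)
X = mu + rng.standard_t(df=3, size=(n, d))
X[:20] += 100.0  # corrupt 5% of the samples
est = sparse_mean_subgradient(X)
```

Note that the sketch never uses the sparsity level k: the small initialization alone separates the growth rates of signal and noise coordinates, which is the sense in which k need not be known in advance.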
