
Robust Mean Estimation under Coordinate-level Corruption with Missing Entries

International Conference on Machine Learning (ICML), 2020
Abstract

We study the problem of robust mean estimation and introduce a novel Hamming distance-based measure of distribution shift for coordinate-level corruptions. We show that this measure yields adversary models that capture more realistic corruptions than those used in prior work, and we present an information-theoretic analysis of robust mean estimation techniques in these settings. We show that for structured distributions, methods that leverage the structure yield more accurate mean estimates. Finally, we introduce a novel two-step meta-algorithm for robust mean estimation that first fixes corruptions in the input data and then performs robust mean estimation. We demonstrate on real-world data with missing values that our two-step approach outperforms existing robust estimation methods and provides accurate mean estimation even in high-magnitude corruption settings.
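To make the shape of the two-step idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes missing entries are marked as NaN, and it swaps in two simple stand-ins chosen only for illustration: per-column median imputation for the "fix corruptions" step and a coordinate-wise trimmed mean for the robust estimation step. The function names `impute_missing`, `trimmed_mean`, and `two_step_mean`, and the `trim_frac` parameter, are hypothetical.

```python
import math
from statistics import median


def impute_missing(rows):
    """Step 1 (illustrative stand-in): replace NaN entries with the
    median of the observed values in the same column."""
    n_cols = len(rows[0])
    col_medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        col_medians.append(median(observed))
    return [
        [col_medians[j] if math.isnan(v) else v for j, v in enumerate(r)]
        for r in rows
    ]


def trimmed_mean(rows, trim_frac=0.1):
    """Step 2 (illustrative stand-in): coordinate-wise trimmed mean.
    Drops the k smallest and k largest values in each column, where
    k = int(trim_frac * n), then averages what remains."""
    n = len(rows)
    k = int(trim_frac * n)
    if 2 * k >= n:
        raise ValueError("trim_frac removes every sample")
    means = []
    for j in range(len(rows[0])):
        col = sorted(r[j] for r in rows)
        kept = col[k:n - k]
        means.append(sum(kept) / len(kept))
    return means


def two_step_mean(rows, trim_frac=0.1):
    """Repair the data first, then estimate the mean robustly."""
    return trimmed_mean(impute_missing(rows), trim_frac)


if __name__ == "__main__":
    nan = float("nan")
    data = [
        [1.0, 10.0],
        [2.0, nan],      # missing coordinate
        [3.0, 30.0],
        [1000.0, 20.0],  # one corrupted coordinate
        [2.0, 20.0],
    ]
    print(two_step_mean(data, trim_frac=0.25))
```

The split into two functions mirrors the meta-algorithm's structure: either step can be replaced independently, for example with an imputation method that exploits known structure in the distribution.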
