Provable tradeoffs in adversarially robust classification

9 June 2020

Abstract

It is well known that machine learning methods can be vulnerable to adversarially chosen perturbations of their inputs. Despite significant progress in the area, foundational open problems remain. Here we address several of these key questions. We derive exact and approximate Bayes-optimal robust classifiers for the important setting of two- and three-class Gaussian classification problems with arbitrary imbalance, for $\ell_2$ and $\ell_\infty$ adversaries. In contrast to classical Bayes-optimal classifiers, decisions here cannot be made pointwise and new theoretical approaches are needed. We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry (Cianchi et al., 2011; Mossel and Neeman, 2015), which, to our knowledge, have not yet been used in the area. Our results reveal tradeoffs between standard and robust accuracy that grow when data is imbalanced. We also show further foundational results, including an analysis of the loss landscape, classification calibration for convex losses in certain models, and finite sample rates for the robust risk.
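To make the standard-versus-robust tradeoff concrete, the following is a minimal sketch, not the paper's derivation. It assumes a two-class model with $y = +1$ with probability `pi_pos` and $x \mid y \sim \mathcal{N}(y\mu, \sigma^2 I)$, and evaluates the robust 0-1 risk of a fixed linear classifier $\mathrm{sign}(w^\top x - c)$ under an $\ell_2$ adversary of radius `eps`. The function and parameter names are illustrative assumptions, and the sketch only scores a given linear rule; it does not compute the Bayes-optimal robust classifier.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_robust_risk(mu, w, c, sigma, eps, pi_pos=0.5):
    """Robust 0-1 risk of sign(w.x - c) under an l2 adversary of radius eps.

    Assumed model (illustrative): y = +1 with probability pi_pos,
    and x | y ~ N(y * mu, sigma^2 * I).
    """
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    proj = sum(wi * mi for wi, mi in zip(w, mu))  # w . mu
    margin = eps * w_norm  # the most an l2 perturbation can shift w . x
    scale = sigma * w_norm  # standard deviation of w . x within each class

    # A class +1 point is robustly misclassified when w.x - c <= eps * ||w||.
    err_pos = normal_cdf((c + margin - proj) / scale)
    # A class -1 point is robustly misclassified when w.x - c >= -eps * ||w||.
    err_neg = normal_cdf((-c + margin - proj) / scale)
    return pi_pos * err_pos + (1.0 - pi_pos) * err_neg


if __name__ == "__main__":
    mu = [1.0, 0.0]
    for eps in (0.0, 0.5, 1.0):
        risk = linear_robust_risk(mu, w=mu, c=0.0, sigma=1.0, eps=eps)
        print(f"eps={eps:.1f}  robust risk={risk:.4f}")
```

With `eps=0` the function returns the standard risk, so sweeping `eps`, the threshold `c`, and `pi_pos` gives a quick numerical view of how the two risks pull apart for a given linear rule in the imbalanced case.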
