Breaking the Madry Defense Model with $L_1$-based Adversarial Examples

International Conference on Learning Representations (ICLR), 2017

30 October 2017

Abstract

The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturbing each pixel of the input image by at most a scaled $L_\infty$ distortion of $\epsilon = 0.3$. This constraint discourages the use of attacks that are not optimized for the $L_\infty$ distortion metric. Our experimental results demonstrate that, by using the elastic-net attack to deep neural networks (EAD), one can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion. These results call into question the use of $L_\infty$ as the sole measure of visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
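To make the distinction between distortion metrics concrete, the following is a minimal sketch, not the paper's code, of how the $L_1$, $L_2$, and $L_\infty$ distortions of a perturbation can be computed and checked against the competition's $\epsilon = 0.3$ budget. The helper names (`distortions`, `within_linf_budget`) and the toy perturbations are hypothetical and chosen only for illustration; pixel values are assumed to be scaled to $[0, 1]$.

```python
def distortions(original, adversarial):
    """Return the (L1, L2, Linf) distortion between two flattened images."""
    diffs = [abs(a - o) for o, a in zip(original, adversarial)]
    l1 = sum(diffs)                       # total absolute change
    l2 = sum(d * d for d in diffs) ** 0.5  # Euclidean change
    linf = max(diffs)                     # largest single-pixel change
    return l1, l2, linf


def within_linf_budget(original, adversarial, epsilon=0.3):
    """Check the per-pixel constraint used in the competition (epsilon assumed 0.3)."""
    return distortions(original, adversarial)[2] <= epsilon


# Toy 28x28 "image" of zeros (illustrative data, not MNIST).
n_pixels = 28 * 28
clean = [0.0] * n_pixels

# Dense perturbation: every pixel changes a little, staying inside the budget.
dense = [0.25] * n_pixels

# Sparse perturbation: only ten pixels change, but each by a large amount.
sparse = [0.9 if i < 10 else 0.0 for i in range(n_pixels)]

for name, adv in (("dense", dense), ("sparse", sparse)):
    l1, l2, linf = distortions(clean, adv)
    print(name, round(l1, 3), round(l2, 3), round(linf, 3),
          within_linf_budget(clean, adv))
```

In this toy setting the sparse perturbation violates the $L_\infty$ budget while touching far fewer pixels and having a much smaller $L_1$ distortion than the dense one, which is the kind of mismatch between $L_\infty$ and the other metrics that the abstract points to.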
