
Measuring the Transferability of \ell_\infty Attacks by the \ell_2 Norm

Abstract

Deep neural networks can be fooled by adversarial examples that differ only trivially from the original samples. To keep the difference imperceptible to human eyes, researchers bound the adversarial perturbations by the \ell_\infty norm, which now commonly serves as the standard for aligning the strength of different attacks for a fair comparison. However, we propose that the \ell_\infty norm alone is not sufficient for measuring attack strength, because even with a fixed \ell_\infty distance, the \ell_2 distance also greatly affects attack transferability between models. This discovery gives a more in-depth understanding of the attack mechanism: several existing methods attack black-box models better partly because they craft perturbations with 70% to 130% larger \ell_2 distances. Since larger perturbations naturally lead to better transferability, we advocate that the strength of attacks be measured simultaneously by both the \ell_\infty and \ell_2 norms. Our proposal is firmly supported by extensive experiments on the ImageNet dataset with 7 attacks, 4 white-box models, and 9 black-box models.
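
To illustrate why a fixed \ell_\infty bound does not pin down the \ell_2 distance, the following is a minimal sketch with toy values, not the paper's experimental setup. The function names, the budget `eps = 8 / 255`, and the 16-element "image" are all assumptions chosen for illustration. Two perturbations share the same \ell_\infty norm, yet the one that uses the full budget on every element has twice the \ell_2 norm of the one that touches only a quarter of the elements.

```python
import math

def linf_norm(delta):
    """Largest absolute entry of the perturbation (l_inf norm)."""
    return max(abs(d) for d in delta)

def l2_norm(delta):
    """Euclidean length of the perturbation (l_2 norm)."""
    return math.sqrt(sum(d * d for d in delta))

# Hypothetical l_inf budget and toy input size (assumptions, not from the paper).
eps = 8 / 255
n = 16

# Dense perturbation: every element sits at +eps or -eps.
dense = [eps if i % 2 == 0 else -eps for i in range(n)]

# Sparse perturbation: only the first 4 elements are changed, each by eps.
sparse = [eps if i < 4 else 0.0 for i in range(n)]

print("l_inf:", linf_norm(dense), linf_norm(sparse))
print("l_2:  ", l2_norm(dense), l2_norm(sparse))
print("l_2 ratio (dense / sparse):", l2_norm(dense) / l2_norm(sparse))
```

Both perturbations satisfy the same \ell_\infty constraint, so a comparison aligned only on \ell_\infty would treat them as equally strong even though their \ell_2 distances differ by a factor of two.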
