
Measuring \ell_\infty Attacks by the \ell_2 Norm

Abstract

Deep Neural Networks (DNNs) can be easily fooled by Adversarial Examples (AEs) whose differences from the original samples are imperceptible to human eyes. To keep the difference imperceptible, existing attacks bound the adversarial perturbation by the \ell_\infty norm, which then serves as the standard for aligning different attacks in a fair comparison. However, when investigating attack transferability, i.e., the capability of AEs crafted on one white-box surrogate DNN to fool other black-box DNNs, we find that the \ell_\infty norm alone is insufficient to measure attack strength, according to our comprehensive experiments covering 7 transfer-based attacks, 4 white-box surrogate models, and 9 black-box victim models. Specifically, we find that the \ell_2 norm greatly affects the transferability of \ell_\infty attacks. Since AEs with larger perturbations naturally transfer better, we advocate that the strength of all attacks be measured by both the widely used \ell_\infty norm and the \ell_2 norm. Although this conclusion and advocacy may seem intuitive, they are necessary for the community, because common evaluations (bounding only the \ell_\infty norm) allow the "attack transferability" to be inflated simply by increasing the "attack strength" (\ell_2 norm), as shown by our simple counter-example method, and the good transferability of several existing methods may be due to their large \ell_2 distances.
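As a minimal illustration of the two measurements (a sketch, not the paper's code), the snippet below computes the \ell_\infty and \ell_2 norms of a perturbation with NumPy. It shows that two perturbations obeying the same \ell_\infty budget (here a hypothetical eps = 8/255) can have very different \ell_2 norms, which is why reporting only the \ell_\infty bound can hide differences in attack strength.

```python
import numpy as np

def perturbation_norms(x_adv, x_orig):
    """Return (l_inf, l_2) norms of the adversarial perturbation."""
    delta = (x_adv - x_orig).ravel()
    return np.abs(delta).max(), np.linalg.norm(delta)

rng = np.random.default_rng(0)
x = rng.random((3, 32, 32)).astype(np.float32)  # hypothetical CIFAR-sized image
eps = 8 / 255                                   # common l_inf budget

# Two perturbations with the same l_inf bound but very different l_2 norms:
# one touches only a few pixels, the other saturates every pixel to eps.
sparse = np.zeros_like(x)
sparse.flat[:10] = eps
dense = np.full_like(x, eps)

for name, delta in [("sparse", sparse), ("dense", dense)]:
    linf, l2 = perturbation_norms(x + delta, x)
    print(f"{name}: l_inf = {linf:.4f}, l_2 = {l2:.4f}")
```

Running this prints the same \ell_\infty value (eps) for both perturbations, while the \ell_2 norm of the dense one is larger by a factor of about sqrt(3072/10).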
