Yet another but more efficient black-box adversarial attack: tiling and evolution strategies

We introduce a new black-box attack achieving state-of-the-art performance. Our approach is based on a new objective function, borrowing ideas from white-box attacks, and particularly designed to fit the requirements of derivative-free optimization. It only requires access to the logits of the classifier, without any other information, which is a more realistic scenario. Not only do we introduce a new objective function; we also extend previous work on black-box adversarial attacks to a larger spectrum of evolution strategies and other derivative-free optimization methods. We also highlight a new intriguing property: deep neural networks are not robust to single-shot tiled attacks. With a budget limited to queries, our models achieve a success rate of up to against the InceptionV3 classifier, with queries to the network on average in the untargeted setting, an improvement of queries over the current state of the art. In the targeted setting, with a limited budget of , we reach a success rate of with queries on average, i.e. we need fewer queries than the current state of the art.
View on arXiv
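To make the setting concrete, here is a minimal sketch of a score-based black-box attack in the spirit described above: a simple (1+1) evolution strategy that queries only the classifier's logits and optimizes a tiled perturbation (one value per tile, upsampled to the image resolution). Everything here is an illustrative assumption, not the paper's implementation: the classifier is a stand-in random linear model (not InceptionV3), the image size, tile size, L_inf budget, mutation step size, and the margin-style objective are all choices made for the sketch.

```python
# Hedged sketch of a tiled, logit-only (1+1) evolution strategy attack.
# All constants and the toy classifier are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

H = W = 32          # image size (assumption; the paper targets 299x299 InceptionV3)
TILE = 8            # perturbation is one scalar per 8x8 tile (a 4x4 grid)
EPS = 0.05          # L_inf perturbation budget
N_CLASSES = 10

# Stand-in "black box": a random linear classifier exposing logits only.
W_clf = rng.normal(size=(H * W, N_CLASSES))

def logits(x):
    """Black-box access: returns logits, no gradients."""
    return x.reshape(-1) @ W_clf

def upsample(tiles):
    """Expand one value per tile into a full-resolution perturbation."""
    return np.kron(tiles, np.ones((TILE, TILE)))

def margin_loss(x, label):
    """Untargeted objective: true-class logit minus best other logit.
    A negative value means the classifier is fooled."""
    z = logits(x)
    other = np.max(np.delete(z, label))
    return z[label] - other

x0 = rng.uniform(0, 1, size=(H, W))
label = int(np.argmax(logits(x0)))

tiles = np.zeros((H // TILE, W // TILE))
best = margin_loss(np.clip(x0 + upsample(tiles), 0, 1), label)
sigma = EPS          # fixed mutation step (real ES variants adapt sigma)
queries = 0
while best >= 0 and queries < 1000:
    # Mutate the tile values, keeping them inside the L_inf budget.
    cand = np.clip(tiles + sigma * rng.normal(size=tiles.shape), -EPS, EPS)
    loss = margin_loss(np.clip(x0 + upsample(cand), 0, 1), label)
    queries += 1
    if loss < best:  # (1+1)-ES: keep the mutation only if it improves
        tiles, best = cand, loss

adv = np.clip(x0 + upsample(tiles), 0, 1)
```

The tiling drastically reduces the search-space dimension (here 16 variables instead of 1024 pixels), which is what makes derivative-free optimizers query-efficient in this setting; swapping the (1+1)-ES for CMA-ES or another derivative-free method only changes the mutation/selection loop.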