Perceptual Deep Neural Networks: Adversarial Robustness through Input Recreation

Abstract

Adversarial examples have shown that, albeit highly accurate, the models learned by machines have many weaknesses that humans do not. Human perception, however, is also fundamentally different from machine perception: we do not see the signals that arrive at the retina but rather a complex recreation of them. In this paper, we explore how machines could likewise recreate their input and investigate the benefits of such augmented perception. To this end, we propose Perceptual Deep Neural Networks (φDNNs), which recreate their own input before further processing. The concept is formalized mathematically, and two variations of it are developed (one based on inpainting the whole image and the other based on a noisy, resized super-resolution recreation). Experiments reveal that φDNNs can reduce attack accuracy substantially, surpassing state-of-the-art defenses in 92% of the tests against adversarial training variations and in 100% of the tests when comparing only with other pre-processing defenses. The inpainting-based φDNN is shown to scale well to larger image sizes, keeping a similarly low attack accuracy, whereas the state of the art worsens by up to a factor of three. Moreover, the recreation process intentionally corrupts the input image. Interestingly, ablation tests show that corrupting the input, although counter-intuitive, is beneficial. Thus, φDNNs reveal that input recreation has strong benefits for artificial neural networks similar to biological ones, shedding light on the importance of purposely corrupting the input, and pioneering an area of perception models based on GANs and autoencoders for robust recognition in artificial intelligence.
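
To make the pipeline concrete, below is a minimal sketch (in PyTorch) of the recreate-then-classify idea described in the abstract. The `recreator` and `classifier` modules, the random pixel-masking corruption, and the `mask_ratio` parameter are all illustrative assumptions; the paper's exact architectures and corruption schemes are not reproduced here.

```python
import torch
import torch.nn as nn


class PerceptualDNN(nn.Module):
    """Sketch of a phi-DNN: corrupt the input, recreate it with a
    generative model, then classify the recreation instead of the
    raw (possibly adversarial) image."""

    def __init__(self, recreator: nn.Module, classifier: nn.Module,
                 mask_ratio: float = 0.5):
        super().__init__()
        # Placeholders: e.g., an inpainting network or a noisy
        # super-resolution model, and any standard image classifier.
        self.recreator = recreator
        self.classifier = classifier
        self.mask_ratio = mask_ratio  # hypothetical corruption strength

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Intentionally corrupt the input: randomly drop pixels,
        # loosely mimicking the inpainting variant (assumed scheme).
        mask = (torch.rand_like(x[:, :1]) > self.mask_ratio).float()
        corrupted = x * mask
        # Recreate the full image from the corrupted observation.
        recreated = self.recreator(corrupted)
        # Classify the recreation, not the original input.
        return self.classifier(recreated)
```

The key design point, per the abstract, is that the corruption-plus-recreation stage sits in front of an otherwise ordinary classifier, so an adversarial perturbation is destroyed and replaced by the generative model's recreation before it can influence the prediction.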
