Robust Deep Learning via Reverse Cross-Entropy Training and Thresholding
- AAML
Although recent progress is substantial, deep learning methods can be vulnerable to elaborately crafted adversarial samples. In this paper, we attempt to improve robustness by presenting a new training procedure and a thresholding test strategy. In training, we propose to minimize the reverse cross-entropy, which encourages a deep network to learn latent representations that better distinguish adversarial samples from normal ones. In testing, we propose a thresholding strategy based on a new metric to filter out adversarial samples, yielding more reliable predictions. Our method is simple to implement with standard algorithms and adds little training cost compared to common cross-entropy minimization. We apply our method to various state-of-the-art networks (e.g., residual networks) and achieve significant improvements on robust prediction in the adversarial setting.
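To make the training objective concrete, below is a minimal NumPy sketch of one plausible reading of a reverse cross-entropy loss: the cross-entropy of the model's predictions against a "reversed" label distribution that puts zero mass on the true class and uniform mass on the remaining classes. The function names and the exact form of the reversed distribution are illustrative assumptions, not the paper's definitive formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def reverse_cross_entropy(logits, labels, num_classes):
    """Sketch of a reverse cross-entropy loss (assumed form).

    The 'reversed' target distribution assigns 0 to the true class
    and 1/(K-1) to each of the other K-1 classes; the loss is the
    cross-entropy of the predicted probabilities against it.
    """
    probs = softmax(logits)
    n = len(labels)
    # Build the reversed label distribution for each sample.
    r = np.full((n, num_classes), 1.0 / (num_classes - 1))
    r[np.arange(n), labels] = 0.0
    eps = 1e-12  # guard against log(0)
    return float(-(r * np.log(probs + eps)).sum(axis=-1).mean())
```

For uniform logits over K classes the loss reduces to log K, since every non-true class receives probability 1/K under the model and weight 1/(K-1) under the reversed target.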