RAT: Boosting Misclassification Detection Ability without Extra Data

18 March 2025

Ge Yan

Tsui-Wei Weng

AAML


Main:8 Pages

7 Figures

Bibliography:2 Pages

7 Tables

Appendix:3 Pages

Abstract

As deep neural networks (DNNs) become increasingly prevalent, particularly in high-stakes areas such as autonomous driving and healthcare, the ability to detect a model's incorrect predictions and intervene accordingly becomes crucial for safety. In this work, we investigate the detection of misclassified inputs for image classification models through the lens of adversarial perturbation: we propose to use the robust radius (a.k.a. input-space margin) as a confidence metric and design two efficient estimation algorithms, RR-BS and RR-Fast, for misclassification detection. Furthermore, we design a training method called Radius Aware Training (RAT) to boost models' ability to identify mistakes. Extensive experiments show that our method achieves up to a 29.3% reduction in AURC and a 21.62% reduction in FPR@95TPR compared with previous methods.
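To make the idea of using an input-space margin as a confidence score concrete, here is a minimal sketch in Python. It is not RR-BS or RR-Fast, whose details the abstract does not give. It assumes a linear classifier, for which the L2 robust radius has a closed form, and pairs it with the standard empirical AURC. The function names `linear_robust_radius` and `aurc` are hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_robust_radius(W, b, x):
    """Exact L2 robust radius of the linear classifier argmax(W @ x + b) at x.

    Assumption: the rows of W are pairwise distinct, so no division by zero.
    """
    z = W @ x + b
    c = int(np.argmax(z))
    # Distance from x to the hyperplane where class j ties with predicted class c.
    dists = [(z[c] - z[j]) / np.linalg.norm(W[c] - W[j])
             for j in range(len(z)) if j != c]
    return c, float(min(dists))

def aurc(confidence, correct):
    """Empirical area under the risk-coverage curve (lower is better)."""
    order = np.argsort(-np.asarray(confidence, dtype=float), kind="stable")
    errors = 1.0 - np.asarray(correct, dtype=float)[order]
    # Risk at coverage k/n is the error rate among the k most confident samples.
    risks = np.cumsum(errors) / np.arange(1, len(errors) + 1)
    return float(risks.mean())

# Toy usage: a two-class linear model and one input.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
pred, radius = linear_robust_radius(W, b, np.array([3.0, 1.0]))
```

A larger radius means a larger perturbation is needed to change the prediction, so the radius can be used to rank predictions by confidence; `aurc` then scores how well such a ranking separates correct from incorrect predictions. For deep networks the radius has no closed form and must be estimated, which is the role the abstract assigns to RR-BS and RR-Fast.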

