Sample-wise Adaptive Weighting for Transfer Consistency in Adversarial Distillation

11 December 2025

Hongsin Lee

Hye Won Chung

AAML


Main:10 Pages

7 Figures

Bibliography:5 Pages

28 Tables

Appendix:11 Pages

Abstract

Adversarial distillation in the standard min-max adversarial training framework aims to transfer adversarial robustness from a large, robust teacher network to a compact student. However, existing work often neglects to incorporate state-of-the-art robust teachers. Through extensive analysis, we find that stronger teachers do not necessarily yield more robust students, a phenomenon known as robust saturation. While this is typically attributed to capacity gaps, we show that such explanations are incomplete. Instead, we identify adversarial transferability (the fraction of student-crafted adversarial examples that remain effective against the teacher) as a key factor in successful robustness transfer. Based on this insight, we propose Sample-wise Adaptive Adversarial Distillation (SAAD), which reweights training examples by their measured transferability without incurring additional computational cost. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that SAAD consistently improves AutoAttack robustness over prior methods. Our code is available at this https URL.
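
To make the transferability notion concrete, the following is a minimal sketch, not the SAAD method itself. It assumes the teacher's predicted labels on the student-crafted adversarial examples are already available, and it counts an example as "transferred" when the teacher misclassifies it. The function names, the two-level weighting rule, and the `low`/`high` values are hypothetical illustrations; the abstract does not specify the weighting function.

```python
from typing import List, Sequence


def transfer_rate(teacher_preds_on_adv: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of student-crafted adversarial examples that also fool the teacher."""
    if len(labels) == 0 or len(labels) != len(teacher_preds_on_adv):
        raise ValueError("inputs must be non-empty and of equal length")
    fooled = [pred != label for pred, label in zip(teacher_preds_on_adv, labels)]
    return sum(fooled) / len(fooled)


def sample_weights(
    teacher_preds_on_adv: Sequence[int],
    labels: Sequence[int],
    low: float = 0.5,   # hypothetical weight for examples that do not transfer
    high: float = 1.0,  # hypothetical weight for examples that do transfer
) -> List[float]:
    """Assumed rule: give more weight to examples whose attack transfers to the teacher."""
    return [high if pred != label else low
            for pred, label in zip(teacher_preds_on_adv, labels)]


def weighted_mean_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weight-normalized average of per-sample distillation losses."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(l * w for l, w in zip(losses, weights)) / total


# Toy batch: the teacher is fooled on the second and fourth examples.
labels = [0, 1, 2, 3]
teacher_preds = [0, 2, 2, 1]
rate = transfer_rate(teacher_preds, labels)
weights = sample_weights(teacher_preds, labels)
loss = weighted_mean_loss([1.0, 2.0, 3.0, 4.0], weights)
```

In this toy batch the transfer rate is 0.5, since two of the four attacks carry over to the teacher. Normalizing by the sum of the weights keeps the loss scale comparable across batches with different transfer rates, which is one reasonable design choice among several.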
