
How Learning Dynamics Drive Adversarially Robust Generalization?

Main: 10 pages
6 figures
Bibliography: 3 pages
Appendix: 17 pages
Abstract

Despite being widely adopted as a canonical framework for learning robust models, adversarial training suffers from robust overfitting. Existing empirical measures and theoretical analyses fall short of providing a satisfying mechanistic account of this phenomenon. By viewing adversarial training with momentum SGD as a discrete-time dynamical system, we introduce a PAC-Bayesian analytical framework that yields time-resolved robust generalization bounds. Specifically, our framework tracks the closed-form evolution of the posterior mean and covariance in both the stationary and the non-stationary transient regime, revealing their connections to the learning rate, the geometry of the loss landscape, and mini-batch stochastic gradients. By empirically approximating the statistical quantities implied by our theory, we offer a unified, mechanistic explanation for robust overfitting. We also explain why adversarial weight perturbation reduces the robust generalization gap by suppressing loss curvature, yet may be suboptimal for optimization because it over-penalizes.
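The abstract's framing of momentum-SGD adversarial training as a discrete-time dynamical system over the weights corresponds to the standard min-max training loop sketched below. This is a minimal illustrative sketch, not the paper's code: the PGD inner step, model, and hyperparameters are assumptions chosen to make the structure concrete.

```python
# Minimal sketch (illustrative, not the paper's implementation):
# adversarial training with momentum SGD, viewed as a discrete-time
# update on the weights w_t driven by mini-batch robust gradients.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Inner maximization: L_inf PGD around the clean input x."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        # Signed gradient ascent step, then projection back onto the L_inf ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach()


def adversarial_training_step(model, optimizer, x, y):
    """Outer minimization: one step of the discrete-time dynamics
    v_{t+1} = mu * v_t + g_t,  w_{t+1} = w_t - lr * v_{t+1},
    where g_t is the mini-batch gradient of the robust loss
    (this is PyTorch's momentum-SGD update)."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()  # momentum update applied by SGD(momentum=mu)
    return loss.item()


# Usage (illustrative):
# optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# for x, y in train_loader:
#     adversarial_training_step(model, optimizer, x, y)
```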
