On the Generalization of Adversarially Trained Quantum Classifiers

Abstract

Quantum classifiers are vulnerable to adversarial attacks that manipulate their input classical or quantum data. A promising countermeasure is adversarial training, where quantum classifiers are trained by using an attack-aware, adversarial loss function. This work establishes novel bounds on the generalization error of adversarially trained quantum classifiers when tested in the presence of perturbation-constrained adversaries. The bounds quantify the excess generalization error incurred to ensure robustness to adversarial attacks as scaling with the training sample size $m$ as $1/\sqrt{m}$, while yielding insights into the impact of the quantum embedding. For quantum binary classifiers employing \textit{rotation embedding}, we find that, in the presence of adversarial attacks on classical inputs $\mathbf{x}$, the increase in sample complexity due to adversarial training over conventional training vanishes in the limit of high-dimensional inputs $\mathbf{x}$. In contrast, when the adversary can directly attack the quantum state $\rho(\mathbf{x})$ encoding the input $\mathbf{x}$, the excess generalization error depends on the choice of embedding only through its Hilbert space dimension. The results are also extended to multi-class classifiers. We validate our theoretical findings with numerical experiments.
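
To make the setting concrete, below is a minimal sketch of adversarial training for a quantum binary classifier with rotation (angle) embedding of classical inputs. It assumes PennyLane; the FGSM-style attack, the perturbation budget `eps`, the hinge-type loss, and the circuit layout are illustrative choices and not the exact construction analyzed in the paper.

```python
# Minimal sketch: adversarial training of a quantum binary classifier with
# rotation embedding of classical inputs x (assumes PennyLane; illustrative only).
import pennylane as qml
from pennylane import numpy as np

n_features = 4
dev = qml.device("default.qubit", wires=n_features)


@qml.qnode(dev)
def classifier(x, weights):
    # Rotation embedding: each classical feature x_i sets a single-qubit rotation angle.
    for i in range(n_features):
        qml.RY(x[i], wires=i)
    # Simple trainable variational layer.
    for i in range(n_features):
        qml.RY(weights[i], wires=i)
    for i in range(n_features - 1):
        qml.CNOT(wires=[i, i + 1])
    # The binary label is read out from the sign of this expectation value.
    return qml.expval(qml.PauliZ(0))


def loss(x, y, weights):
    # Margin (hinge-like) loss for labels y in {-1, +1}.
    return np.maximum(0.0, 1.0 - y * classifier(x, weights))


def fgsm_attack(x, y, weights, eps=0.1):
    # Inner maximization: perturb the classical input x within an L_inf ball
    # of radius eps, following the sign of the input gradient.
    grad_x = qml.grad(loss, argnum=0)(x, y, weights)
    return x + eps * np.sign(grad_x)


# One adversarial-training step on a toy mini-batch of m = 8 samples.
weights = np.random.uniform(0, np.pi, n_features, requires_grad=True)
X = np.random.uniform(0, np.pi, (8, n_features))
Y = np.sign(np.random.uniform(-1.0, 1.0, 8))

X_adv = [fgsm_attack(X[i], Y[i], weights) for i in range(len(X))]


def adversarial_batch_loss(w):
    # Attack-aware training objective: average loss on the perturbed inputs.
    losses = [loss(X_adv[i], Y[i], w) for i in range(len(X))]
    return sum(losses) / len(losses)


opt = qml.GradientDescentOptimizer(stepsize=0.1)
weights = opt.step(adversarial_batch_loss, weights)
```

In this sketch, the attack perturbs the classical input before the rotation embedding; attacking the encoded quantum state $\rho(\mathbf{x})$ directly, as also considered in the paper, would replace the input perturbation with a perturbation of the density matrix itself.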

@article{georgiou2025_2504.17690,
  title={On the Generalization of Adversarially Trained Quantum Classifiers},
  author={Petros Georgiou and Aaron Mark Thomas and Sharu Theresa Jose and Osvaldo Simeone},
  journal={arXiv preprint arXiv:2504.17690},
  year={2025}
}