On the Generalization of Adversarially Trained Quantum Classifiers

24 April 2025

Abstract

Quantum classifiers are vulnerable to adversarial attacks that manipulate their classical or quantum input data. A promising countermeasure is adversarial training, in which quantum classifiers are trained using an attack-aware, adversarial loss function. This work establishes novel bounds on the generalization error of adversarially trained quantum classifiers when tested in the presence of perturbation-constrained adversaries. The bounds show that the excess generalization error incurred to ensure robustness to adversarial attacks scales with the training sample size $m$ as $1/\sqrt{m}$, while yielding insights into the impact of the quantum embedding. For quantum binary classifiers employing \textit{rotation embedding}, we find that, in the presence of adversarial attacks on classical inputs $\mathbf{x}$, the increase in sample complexity due to adversarial training over conventional training vanishes in the limit of high-dimensional inputs $\mathbf{x}$. In contrast, when the adversary can directly attack the quantum state $\rho(\mathbf{x})$ encoding the input $\mathbf{x}$, the excess generalization error depends on the choice of embedding only through its Hilbert space dimension. The results are also extended to multi-class classifiers. We validate our theoretical findings with numerical experiments.
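
To make the terms in the abstract concrete, the following is a minimal classical simulation sketch of an adversarial loss for a toy binary classifier with rotation embedding. It is not the construction analyzed in the paper: the choice of an $R_Y$ rotation per feature, the trainable angles `theta`, the averaged Pauli-$Z$ observable, the linear loss, and the random-search attack over an $\ell_\infty$ ball of radius `eps` are all assumptions made for illustration. Random search only lower-bounds the true worst-case loss.

```python
import numpy as np

def rotation_embedding(x):
    """State vector of the product state RY(x_1)|0> ⊗ ... ⊗ RY(x_n)|0> (assumed embedding)."""
    state = np.array([1.0])
    for angle in x:
        qubit = np.array([np.cos(angle / 2.0), np.sin(angle / 2.0)])
        state = np.kron(state, qubit)
    return state

def classifier_output(x, theta):
    """Expectation of the qubit-averaged Pauli-Z observable.

    The trainable layer is assumed to be one RY(theta_k) per qubit, which
    composes with the embedding into RY(x_k + theta_k).
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    n = x.size
    psi = rotation_embedding(x + theta)
    # Eigenvalue of (1/n) * sum_k Z_k on basis state b is 1 - 2 * popcount(b) / n.
    popcount = np.array([bin(b).count("1") for b in range(2 ** n)])
    z_mean_diag = 1.0 - 2.0 * popcount / n
    return float(np.sum(psi ** 2 * z_mean_diag))

def loss(x, y, theta):
    """Linear loss in [0, 1] for a label y in {-1, +1} (illustrative choice)."""
    return 0.5 * (1.0 - y * classifier_output(x, theta))

def adversarial_loss(x, y, theta, eps, n_trials=200, seed=0):
    """Largest loss found by random search over ||delta||_inf <= eps.

    This is a lower bound on the true worst-case (adversarial) loss.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    worst = loss(x, y, theta)  # delta = 0 is always a feasible perturbation
    for _ in range(n_trials):
        delta = rng.uniform(-eps, eps, size=x.shape)
        worst = max(worst, loss(x + delta, y, theta))
    return worst
```

Averaging `adversarial_loss` over a training set and minimizing it with respect to `theta` would give a simple form of adversarial training for this toy model; a stronger attack, such as a gradient-based one, would tighten the inner maximization.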


@article{georgiou2025_2504.17690,
  title={On the Generalization of Adversarially Trained Quantum Classifiers},
  author={Petros Georgiou and Aaron Mark Thomas and Sharu Theresa Jose and Osvaldo Simeone},
  journal={arXiv preprint arXiv:2504.17690},
  year={2025}
}
