A Formally Verified Robustness Certifier for Neural Networks (Extended Version)

Neural networks are often susceptible to minor perturbations in input that cause them to misclassify. A recent solution to this problem is the use of globally-robust neural networks, which employ a function to certify that the classification of an input cannot be altered by such a perturbation. Outputs that pass this test are called certified robust. However, to the authors' knowledge, these certification functions have not yet been verified at the implementation level. We demonstrate how previous unverified implementations are exploitably unsound in certain circumstances. Moreover, they often rely on approximation-based algorithms, such as power iteration, that (perhaps surprisingly) do not guarantee soundness. To provide assurance that a given output is robust, we implemented and formally verified a certification function for globally-robust neural networks in Dafny. We describe the program, its specifications, and the important design decisions taken for its implementation and verification, as well as our experience applying it in practice.
View on arXiv@article{tobler2025_2505.06958, title={ A Formally Verified Robustness Certifier for Neural Networks (Extended Version) }, author={ James Tobler and Hira Taqdees Syeda and Toby Murray }, journal={arXiv preprint arXiv:2505.06958}, year={ 2025 } }