Training Safe Neural Networks with Global SDP Bounds

15 September 2024

David "davidad" Dalrymple

Abstract

This paper presents a novel approach to training neural networks with formal safety guarantees, using semidefinite programming (SDP) for verification. Our method verifies safety over large, high-dimensional input regions, addressing limitations of existing techniques, which focus on adversarial robustness bounds. We introduce an ADMM-based training scheme for an accurate neural network classifier on the Adversarial Spheres dataset, achieving provably perfect recall with input dimensions up to $d=40$. This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications in safe RL policies.
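For readers unfamiliar with the Adversarial Spheres dataset named in the abstract, the following is a minimal sketch of how such data can be sampled: points are drawn uniformly from two concentric spheres in $d$ dimensions and labeled by which sphere they lie on. The radii (1.0 and 1.3), the label assignment, and the function name `sample_spheres` are assumptions for illustration and are not taken from this abstract; the paper's own setup may differ.

```python
import numpy as np

def sample_spheres(n, d, r_inner=1.0, r_outer=1.3, seed=0):
    """Sample n points from two concentric spheres in R^d (hypothetical helper).

    r_inner and r_outer are assumed radii, not values stated in the abstract.
    Returns (X, y) where y[i] == 1 means X[i] lies on the outer sphere.
    """
    rng = np.random.default_rng(seed)
    # Normalizing a standard Gaussian vector gives a uniform point on the unit sphere.
    x = rng.standard_normal((n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    # Assign each point to the inner (0) or outer (1) sphere at random.
    y = rng.integers(0, 2, size=n)
    radii = np.where(y == 1, r_outer, r_inner)
    return x * radii[:, None], y

# d = 40 matches the largest input dimension reported in the abstract.
X, y = sample_spheres(1000, 40)
```

A classifier trained on such data separates the two radii; verification over a whole input region (rather than a small neighborhood of each sample) is what the abstract's SDP bounds are aimed at.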
