
Training Safe Neural Networks with Global SDP Bounds

Roman Soletskyi
David "davidad" Dalrymple
Abstract

This paper presents a novel approach to training neural networks with formal safety guarantees using semidefinite programming (SDP) for verification. Our method targets safety verification over large, high-dimensional input regions, addressing a limitation of existing techniques, which focus on adversarial robustness bounds. We introduce an ADMM-based training scheme for an accurate neural network classifier on the Adversarial Spheres dataset, achieving provably perfect recall with input dimensions up to d = 40. This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications in safe reinforcement learning (RL) policies.
