Non-Convex Optimization with Spectral Radius Regularization

We develop regularization methods that find flat minima while training deep neural networks. These minima generalize better than sharp minima, yielding models that outperform baselines on real-world test data (which may be distributed differently from the training data). Specifically, we propose a regularized optimization method that reduces the spectral radius of the Hessian of the loss function. We also derive algorithms to efficiently optimize neural network models and prove that these algorithms almost surely converge. Furthermore, we demonstrate that our algorithm works effectively on applications in different domains, including healthcare. To show that our models generalize well, we introduce various methods for testing generalizability and find that our models outperform comparable baseline models on these tests.
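To illustrate the general idea of penalizing the spectral radius of the loss Hessian, here is a minimal sketch in PyTorch (not the authors' exact algorithm from the paper): it estimates the largest Hessian eigenvalue magnitude via power iteration on Hessian-vector products and adds it to the training loss. The names `spectral_radius_penalty` and `rho_weight`, the iteration count, and the usage snippet are illustrative assumptions.

```python
import torch

def spectral_radius_penalty(loss, params, num_iters=10):
    """Estimate the spectral radius (largest |eigenvalue|) of the Hessian of
    `loss` w.r.t. `params` via power iteration on Hessian-vector products.
    A hedged sketch of the general technique, not the paper's algorithm."""
    params = [p for p in params if p.requires_grad]
    # First-order gradients with create_graph=True so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)

    # Random unit vector in parameter space.
    v = [torch.randn_like(p) for p in params]
    norm = torch.sqrt(sum((x * x).sum() for x in v))
    v = [x / norm for x in v]

    eigval = torch.zeros((), device=v[0].device)
    for _ in range(num_iters):
        # Hessian-vector product: differentiate (grad . v) w.r.t. the parameters.
        gv = sum((g * x).sum() for g, x in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True, create_graph=True)
        # ||H v|| approaches the spectral radius as v aligns with the top eigenvector.
        eigval = torch.sqrt(sum((h * h).sum() for h in hv))
        v = [h / (eigval + 1e-12) for h in hv]
    return eigval

# Usage sketch (model, criterion, x, y, and rho_weight are hypothetical):
#   loss = criterion(model(x), y)
#   total = loss + rho_weight * spectral_radius_penalty(loss, list(model.parameters()))
#   total.backward()
```

Because the Hessian is symmetric, its largest singular value equals its spectral radius, so the norm of the Hessian-vector product after power iteration serves as the penalty; differentiating through this estimate is what makes it usable as a regularizer during training.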
@article{sandler2025_2102.11210,
  title   = {Non-Convex Optimization with Spectral Radius Regularization},
  author  = {Adam Sandler and Diego Klabjan and Yuan Luo},
  journal = {arXiv preprint arXiv:2102.11210},
  year    = {2025}
}