Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 1: Theory

Abstract
For one-hidden-layer ReLU networks, we prove that all differentiable local minima are global inside differentiable regions. We give the locations and losses of these differentiable local minima and show that they can be isolated points or continuous hyperplanes, depending on the interplay among the data, the activation patterns of the hidden neurons, and the network size. Furthermore, we give necessary and sufficient conditions for the existence of saddle points and non-differentiable local minima, and give their locations when they exist.
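The notion of "differentiable regions" can be made concrete with a small numerical sketch. The parameter space of a one-hidden-layer ReLU network is partitioned into cells on which the on/off activation pattern of every hidden neuron on every training sample is fixed; inside such a cell the ReLU is linear and the loss is smooth. The sketch below is a simplified illustration, not the paper's exact construction: it assumes a scalar-output network with fixed output weights `v` and squared loss, so that within a fixed activation pattern the loss is a convex quadratic in the hidden weights and its minimizer over that region can be found in closed form. All names (`X`, `y`, `v`, `activation_pattern`, `W_star`) are illustrative choices, not notation from the paper.

```python
import numpy as np

# Simplified illustration (not the paper's construction): a one-hidden-layer
# ReLU network f(x) = sum_k v_k * relu(w_k . x) with the output weights v
# held fixed. For a fixed activation pattern -- which hidden unit is on/off
# for each training sample -- the ReLU is locally linear, so the squared
# loss restricted to that differentiable region is convex quadratic in W.

rng = np.random.default_rng(0)
n, d, k = 20, 3, 4                      # samples, input dim, hidden units
X = rng.normal(size=(n, d))             # training inputs
y = rng.normal(size=n)                  # training targets
v = rng.choice([-1.0, 1.0], size=k)     # fixed output-layer weights

def activation_pattern(W):
    """Boolean (n, k) matrix: which hidden unit is active on which sample."""
    return (X @ W.T) > 0

def loss(W):
    """Squared loss of the ReLU network on the training set."""
    h = np.maximum(X @ W.T, 0.0)        # hidden activations, shape (n, k)
    return 0.5 * np.mean((h @ v - y) ** 2)

# Pick a reference point and read off its activation pattern; the set of
# weights sharing this pattern is one differentiable region.
W0 = rng.normal(size=(k, d))
P = activation_pattern(W0)

# Inside that region, f(x_i) = sum_k v_k * P[i, k] * (w_k . x_i) is linear
# in W, so minimizing the squared loss is a least-squares problem. Build the
# design matrix A with columns v_k * P[:, k] * X stacked over hidden units.
A = np.concatenate([v[j] * P[:, [j]] * X for j in range(k)], axis=1)  # (n, k*d)
w_flat, *_ = np.linalg.lstsq(A, y, rcond=None)
W_star = w_flat.reshape(k, d)

# If W_star keeps the same activation pattern, it is the minimum of the loss
# over that whole differentiable region, so any differentiable local minimum
# found inside the region must attain this value.
print("minimizer stays in region:", np.array_equal(activation_pattern(W_star), P))
print("loss at W0:    ", loss(W0))
print("loss at W_star:", loss(W_star))
```

In this simplified setting the "local minimum is the region's minimum" statement reduces to convexity of a quadratic on a convex cell; the paper's contribution is the analysis of the general case, including when these minima form isolated points versus continuous hyperplanes, and when saddle points or non-differentiable local minima arise on region boundaries.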