Geometric structure of Deep Learning networks and construction of global minimizers
In this paper, we provide a geometric interpretation of the structure of Deep Learning (DL) networks, characterized by $L$ hidden layers, a ramp activation function, an $\mathcal{L}^2$ Schatten class (or Hilbert-Schmidt) cost function, and input and output spaces $\mathbb{R}^Q$ with equal dimension $Q \geq 1$. The hidden layers are defined on spaces $\mathbb{R}^Q$, as well. We apply our recent results on shallow neural networks to construct an explicit family of minimizers for the global minimum of the cost function in the case $L \geq Q$, which we show to be degenerate. In the context presented here, the hidden layers of the DL network "curate" the training inputs by recursive application of a truncation map that minimizes the noise-to-signal ratio of the training inputs. Moreover, we determine a set of $2^Q - 1$ distinct degenerate local minima of the cost function.
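To make the setup concrete, the following is a minimal numerical sketch of the architecture the abstract describes: $L$ hidden layers of the same width $Q$ as the input and output spaces, a ramp (ReLU) activation, and a Hilbert-Schmidt cost, interpreted here as the squared Frobenius norm of the residual matrix over the training set. The dimensions, weights, and clipping biases are illustrative assumptions only, not the paper's explicit minimizing construction.

```python
import numpy as np

def ramp(x):
    """Ramp (ReLU) activation, applied componentwise."""
    return np.maximum(x, 0.0)

def forward(X, weights, biases):
    """Recursively apply L hidden layers R^Q -> R^Q to the columns of X.
    Each layer is an affine map followed by the ramp; with suitable (W, b)
    it acts as a truncation map on the training inputs."""
    for W, b in zip(weights, biases):
        X = ramp(W @ X + b[:, None])
    return X

def hilbert_schmidt_cost(outputs, targets):
    """Hilbert-Schmidt (L^2 Schatten / Frobenius) cost: squared Frobenius
    norm of the residual between network outputs and target outputs."""
    return np.linalg.norm(outputs - targets, ord="fro") ** 2

# Hypothetical toy configuration (not the paper's construction):
# Q = 3, L = 4 hidden layers (so L >= Q), N = 10 training inputs.
rng = np.random.default_rng(0)
Q, L, N = 3, 4, 10
weights = [np.eye(Q) for _ in range(L)]          # identity weight matrices
biases = [-0.1 * np.ones(Q) for _ in range(L)]   # negative shift: each layer clips small coordinates
X = rng.normal(size=(Q, N))                      # columns are training inputs in R^Q
Y_target = rng.normal(size=(Q, N))               # columns are target outputs in R^Q
print(hilbert_schmidt_cost(forward(X, weights, biases), Y_target))
```

The negative biases only gesture at the truncation behavior the abstract attributes to the hidden layers, where each ramp application clips part of the input and thereby reduces the noise component of the training data.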