Geometric structure of shallow neural networks and constructive $\mathcal{L}^2$ cost minimization
In this paper, we provide a geometric interpretation of the structure of shallow neural networks characterized by one hidden layer, a ramp activation function, an $\mathcal{L}^2$ Schatten class (or Hilbert-Schmidt) cost function, input space $\mathbb{R}^M$, output space $\mathbb{R}^Q$ with $Q\leq M$, and training input sample size $N>QM$. We prove an upper bound on the minimum of the cost function of order $O(\delta_P)$, where $\delta_P$ measures the signal-to-noise ratio of the training inputs. We obtain an approximate optimizer using projections adapted to the averages $\overline{x_{0,j}}$ of training input vectors belonging to the same output vector $y_j$, $j=1,\dots,Q$. In the special case $M=Q$, we explicitly determine an exact degenerate local minimum of the cost function; the sharp value differs from the upper bound obtained for $Q\leq M$ by a relative error of order $O(\delta_P^2)$. The proof of the upper bound yields a constructively trained network; we show that it metrizes the $Q$-dimensional subspace of the input space $\mathbb{R}^M$ spanned by $\overline{x_{0,j}}$, $j=1,\dots,Q$. We comment on the characterization of the global minimum of the $\mathcal{L}^2$ cost function in the given context.
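As an illustration of the setting, the shallow architecture and an $\mathcal{L}^2$-type empirical cost can be written schematically as follows; the precise conventions (weight/bias names, normalization, grouping of the $N$ training inputs into $Q$ output classes of sizes $N_j$) are assumptions for exposition, not taken verbatim from the paper:

```latex
% Schematic setup (assumed conventions, for illustration only).
% One hidden layer with ramp activation acting componentwise:
\[
  x^{(1)} = \sigma\!\left(W_1 x + b_1\right),
  \qquad \sigma(z) = \max\{z, 0\} \quad \text{(ramp / ReLU)},
\]
% Empirical L^2 cost over N training inputs x_{j,i} in R^M with
% target outputs y_j in R^Q, j = 1, ..., Q:
\[
  \mathcal{C}[W_1, b_1, W_2, b_2]
  = \Bigg( \frac{1}{N} \sum_{j=1}^{Q} \sum_{i=1}^{N_j}
      \big\| W_2\, x^{(1)}_{j,i} + b_2 - y_j \big\|_{2}^{2} \Bigg)^{1/2},
  \qquad N = \sum_{j=1}^{Q} N_j .
\]
```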