Exact Solutions of a Deep Linear Network

Neural Information Processing Systems (NeurIPS), 2022
Abstract

This work finds the exact solutions to a deep linear network with weight decay and stochastic neurons, a fundamental model for understanding the landscape of neural networks. Our result implies that weight decay strongly interacts with the model architecture and can create bad minima in a network with more than 1 hidden layer, which is qualitatively different from a network with only 1 hidden layer. As an application, we also analyze stochastic nets and show that their prediction variance vanishes as the stochasticity, the width, or the depth tends to infinity.
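The depth dependence described in the abstract can be illustrated with a toy numerical check. The sketch below is not the paper's construction: it assumes a width-1 (scalar) deterministic linear network, a single training pair (x, y) = (1, 1), and a weight decay of 0.1, and the function names `loss` and `min_loss_change_near_zero` are hypothetical. It probes random directions on a small sphere around the all-zero weights and reports the smallest change in the regularized loss.

```python
import numpy as np

def loss(weights, x=1.0, y=1.0, weight_decay=0.1):
    """Squared error of a scalar deep linear net plus L2 weight decay.

    Toy assumption: every layer has width 1, so the network output is
    simply the product of the weights times the input x.
    """
    weights = np.asarray(weights, dtype=float)
    prediction = np.prod(weights) * x
    return (prediction - y) ** 2 + weight_decay * np.sum(weights ** 2)

def min_loss_change_near_zero(num_weights, radius=1e-2, trials=2000, seed=0):
    """Smallest loss change over random points at distance `radius` from zero.

    A negative value means some direction lowers the loss, so the origin
    is not a local minimum. A positive value is consistent with the origin
    being a local minimum (sampled evidence, not a proof).
    """
    rng = np.random.default_rng(seed)
    base = loss(np.zeros(num_weights))
    directions = rng.normal(size=(trials, num_weights))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return min(loss(radius * d) - base for d in directions)

if __name__ == "__main__":
    # 2 weights = 1 hidden layer, 3 weights = 2 hidden layers.
    print("1 hidden layer :", min_loss_change_near_zero(2))
    print("2 hidden layers:", min_loss_change_near_zero(3))
```

With one hidden layer (two weights), the data term contributes at second order in the weights and can outweigh the weight decay, so a descent direction exists at zero. With two hidden layers (three weights), the data term first appears at third order, so the quadratic weight-decay term dominates close to zero and every sampled direction increases the loss. In this toy setting the origin is also not the global minimum: `loss([0.9, 0.9, 0.9])` is smaller than `loss([0.0, 0.0, 0.0])`.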
