Exponentially vanishing sub-optimal local minima in multilayer neural networks

19 February 2017

Papers citing "Exponentially vanishing sub-optimal local minima in multilayer neural networks"

Title
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding Zhengqing Wu Berfin Simsek Francois Ged ODL 55 0 0 08 Feb 2024
Improved Convergence Guarantees for Shallow Neural Networks A. Razborov ODL 39 1 0 05 Dec 2022
Random matrix analysis of deep neural network weight matrices M. Thamm Max Staats B. Rosenow 42 12 0 28 Mar 2022
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions Patrick Cheridito Arnulf Jentzen Florian Rossmannek 34 10 0 19 Mar 2021
The Limit of the Batch Size Yang You Yuhui Wang Huan Zhang Zhao-jie Zhang J. Demmel Cho-Jui Hsieh 41 15 0 15 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks Patrick Cheridito Arnulf Jentzen Florian Rossmannek 34 37 0 12 Jun 2020
Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity Shiyu Liang Ruoyu Sun R. Srikant 50 19 0 31 Dec 2019
Optimization for deep learning: theory and algorithms Ruoyu Sun ODL 61 168 0 19 Dec 2019
The Local Elasticity of Neural Networks Hangfeng He Weijie J. Su 52 45 0 15 Oct 2019
Exploring Model-based Planning with Policy Networks Tingwu Wang Jimmy Ba 49 148 0 20 Jun 2019
On the Power and Limitations of Random Features for Understanding Neural Networks Gilad Yehudai Ohad Shamir MLT 38 182 0 01 Apr 2019
Augment your batch: better training with larger batches Elad Hoffer Tal Ben-Nun Itay Hubara Niv Giladi Torsten Hoefler Daniel Soudry ODL 52 72 0 27 Jan 2019
Elimination of All Bad Local Minima in Deep Learning Kenji Kawaguchi L. Kaelbling 30 44 0 02 Jan 2019
On the Benefit of Width for Neural Networks: Disappearance of Bad Basins Dawei Li Tian Ding Ruoyu Sun 52 38 0 28 Dec 2018
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks Henning Petzka C. Sminchisescu 50 9 0 16 Dec 2018
Gradient Descent Finds Global Minima of Deep Neural Networks S. Du Jason D. Lee Haochuan Li Liwei Wang Xiyu Zhai ODL 53 1,130 0 09 Nov 2018
Learning One-hidden-layer Neural Networks under General Input Distributions Weihao Gao Ashok Vardhan Makkuva Sewoong Oh Pramod Viswanath MLT 42 28 0 09 Oct 2018
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning Charles H. Martin Michael W. Mahoney AI4CE 75 195 0 02 Oct 2018
On the loss landscape of a class of deep neural networks with no bad local valleys Quynh N. Nguyen Mahesh Chandra Mukkamala Matthias Hein 34 87 0 27 Sep 2018
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach Ryo Karakida S. Akaho S. Amari FedML 73 142 0 04 Jun 2018
Understanding Generalization and Optimization Performance of Deep CNNs Pan Zhou Jiashi Feng MLT 60 49 0 28 May 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport Lénaïc Chizat Francis R. Bach OT 95 726 0 24 May 2018
Adding One Neuron Can Eliminate All Bad Local Minima Shiyu Liang Ruoyu Sun Jason D. Lee R. Srikant 55 89 0 22 May 2018
Mad Max: Affine Spline Insights into Deep Learning Randall Balestriero Richard Baraniuk AI4CE 41 78 0 17 May 2018
The Global Optimization Geometry of Shallow Linear Neural Networks Zhihui Zhu Daniel Soudry Yonina C. Eldar M. Wakin ODL 31 36 0 13 May 2018
On the Power of Over-parametrization in Neural Networks with Quadratic Activation S. Du Jason D. Lee 62 268 0 03 Mar 2018
Understanding the Loss Surface of Neural Networks for Binary Classification Shiyu Liang Ruoyu Sun Yixuan Li R. Srikant 40 87 0 19 Feb 2018
Fix your classifier: the marginal value of training the last weight layer Elad Hoffer Itay Hubara Daniel Soudry 57 101 0 14 Jan 2018
The Multilinear Structure of ReLU Networks T. Laurent J. V. Brecht 33 51 0 29 Dec 2017
Visualizing the Loss Landscape of Neural Nets Hao Li Zheng Xu Gavin Taylor Christoph Studer Tom Goldstein 178 1,870 0 28 Dec 2017
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data Alon Brutzkus Amir Globerson Eran Malach Shai Shalev-Shwartz MLT 78 277 0 27 Oct 2017
Characterization of Gradient Dominance and Regularity Conditions for Neural Networks Yi Zhou Yingbin Liang 27 33 0 18 Oct 2017
High-dimensional dynamics of generalization error in neural networks Madhu S. Advani Andrew M. Saxe AI4CE 97 464 0 10 Oct 2017
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks Mahdi Soltanolkotabi Adel Javanmard Jason D. Lee 51 417 0 16 Jul 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks Elad Hoffer Itay Hubara Daniel Soudry ODL 69 797 0 24 May 2017
The loss surface of deep and wide neural networks Quynh N. Nguyen Matthias Hein ODL 61 283 0 26 Apr 2017
The Loss Surfaces of Multilayer Networks A. Choromańska Mikael Henaff Michaël Mathieu Gerard Ben Arous Yann LeCun ODL 192 1,187 0 30 Nov 2014