Deep Learning without Poor Local Minima

23 May 2016

Papers citing "Deep Learning without Poor Local Minima"

Title
Stacking as Accelerated Gradient Descent | Naman Agarwal, Pranjal Awasthi, Satyen Kale, Eric Zhao | ODL | 87 2 0 | 20 Feb 2025
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input | Ziang Chen, Rong Ge | MLT | 84 1 0 | 10 Jan 2025
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning | Yehonathan Refael, Jonathan Svirsky, Boris Shustin, Wasim Huleihel, Ofir Lindenbaum | - | 61 3 0 | 31 Dec 2024
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks | Ann Huang, Satpreet H. Singh, Flavio Martinelli, Kanaka Rajan | - | 47 0 0 | 04 Oct 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding | Zhengqing Wu, Berfin Simsek, Francois Ged | ODL | 61 0 0 | 08 Feb 2024
Critical Influence of Overparameterization on Sharpness-aware Minimization | Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee | AAML | 69 1 0 | 29 Nov 2023
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation | Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou | - | 57 75 0 | 18 Jun 2020
An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias | Lu Yu, Krishnakumar Balasubramanian, S. Volgushev, Murat A. Erdogdu | - | 65 50 0 | 14 Jun 2020
Beyond Random Matrix Theory for Deep Networks | Diego Granziol | - | 62 16 0 | 13 Jun 2020
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis | Yuandong Tian | MLT | 90 216 0 | 02 Mar 2017
Deep Semi-Random Features for Nonlinear Function Approximation | Kenji Kawaguchi, Bo Xie, Vikas Verma, Le Song | - | 104 15 0 | 28 Feb 2017
Exponentially vanishing sub-optimal local minima in multilayer neural networks | Daniel Soudry, Elad Hoffer | - | 93 97 0 | 19 Feb 2017
Asynchronous Stochastic Gradient Descent with Delay Compensation | Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhiming Ma, Tie-Yan Liu | - | 75 313 0 | 27 Sep 2016
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks | Andrew M. Saxe, James L. McClelland, Surya Ganguli | ODL | 93 1,830 0 | 20 Dec 2013