No bad local minima: Data independent training error guarantees for multilayer neural networks
Daniel Soudry, Y. Carmon · arXiv:1605.08361 · 26 May 2016
Papers citing "No bad local minima: Data independent training error guarantees for multilayer neural networks" (48 of 48 papers shown)
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged · ODL · 08 Feb 2024
When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Z. Luo · 21 Oct 2022
Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu-Xiang Wang · MQ · 13 Jun 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022
Exponentially Many Local Minima in Quantum Neural Networks
Xuchen You, Xiaodi Wu · 06 Oct 2021
Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
Jizong Peng, Ping Wang, Christian Desrosiers, M. Pedersoli · SSL · 29 Jul 2021
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 19 Mar 2021
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · FedML · 17 Dec 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 12 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · MLT, AAML · 20 May 2020
A study of local optima for learning feature interactions using neural networks
Yangzi Guo, Adrian Barbu · 11 Feb 2020
Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea, Berfin Simsek, Bernd Illing, W. Gerstner · 05 Jul 2019
Robust and Resource Efficient Identification of Two Hidden Layer Neural Networks
M. Fornasier, T. Klock, Michael Rauchensteiner · 30 Jun 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu, Yuanzhi Li · 24 May 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak · NoLa · 27 Mar 2019
Understanding over-parameterized deep networks by geometrization
Xiao Dong, Ling Zhou · GNN, AI4CE · 11 Feb 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · MLT · 24 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu · 24 Jan 2019
Scaling description of generalization with number of parameters in deep learning
Mario Geiger, Arthur Jacot, S. Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, M. Wyart · 06 Jan 2019
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Henning Petzka, C. Sminchisescu · 16 Dec 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, J. Lee, Haochuan Li, Liwei Wang, M. Tomizuka · ODL · 09 Nov 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
Chulhee Yun, S. Sra, Ali Jadbabaie · 17 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei, J. Lee, Qiang Liu, Tengyu Ma · 12 Oct 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu · 04 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · MLT, ODL · 04 Oct 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu · MLT · 20 Jun 2018
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
Dong Yin, Yudong Chen, K. Ramchandran, Peter L. Bartlett · FedML · 14 Jun 2018
Adding One Neuron Can Eliminate All Bad Local Minima
Shiyu Liang, Ruoyu Sun, J. Lee, R. Srikant · 22 May 2018
End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition
Samet Oymak, Mahdi Soltanolkotabi · 16 May 2018
The Global Optimization Geometry of Shallow Linear Neural Networks
Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, M. Wakin · ODL · 13 May 2018
The Loss Surface of XOR Artificial Neural Networks
D. Mehta, Xiaojun Zhao, Edgar A. Bernal, D. Wales · 06 Apr 2018
Comparing Dynamics: Deep Neural Networks versus Glassy Systems
M. Baity-Jesi, Levent Sagun, Mario Geiger, S. Spigler, Gerard Ben Arous, C. Cammarota, Yann LeCun, M. Wyart, Giulio Biroli · AI4CE · 19 Mar 2018
Constrained Deep Learning using Conditional Gradient and Applications in Computer Vision
Sathya Ravi, Tuan Dinh, Vishnu Suresh Lokhande, Vikas Singh · AI4CE · 17 Mar 2018
Essentially No Barriers in Neural Network Energy Landscape
Felix Dräxler, K. Veschgini, M. Salmhofer, Fred Hamprecht · MoMe · 02 Mar 2018
Deep Neural Networks Learn Non-Smooth Functions Effectively
Masaaki Imaizumi, Kenji Fukumizu · 13 Feb 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer, Itay Hubara, Daniel Soudry · 14 Jan 2018
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir · 24 Dec 2017
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi, Adel Javanmard, J. Lee · 16 Jul 2017
Global optimality conditions for deep neural networks
Chulhee Yun, S. Sra, Ali Jadbabaie · 08 Jul 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon · MLT · 10 Jun 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari, Adam M. Oberman, Stanley Osher, Stefano Soatto, G. Carlier · 17 Apr 2017
Convergence Results for Neural Networks via Electrodynamics
Rina Panigrahy, Sushant Sachdeva, Qiuyi Zhang · MLT, MDE · 01 Feb 2017
An empirical analysis of the optimization of deep network loss surfaces
Daniel Jiwoong Im, Michael Tao, K. Branson · ODL · 13 Dec 2016
Identity Matters in Deep Learning
Moritz Hardt, Tengyu Ma · OOD · 14 Nov 2016
Topology and Geometry of Half-Rectified Network Optimization
C. Freeman, Joan Bruna · 04 Nov 2016
Piecewise convexity of artificial neural networks
Blaine Rister, Daniel L Rubin · AAML, ODL · 17 Jul 2016
The Loss Surfaces of Multilayer Networks
A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun · ODL · 30 Nov 2014
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton, Nitish Srivastava, A. Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov · VLM · 03 Jul 2012