No bad local minima: Data independent training error guarantees for multilayer neural networks
Daniel Soudry, Y. Carmon · arXiv:1605.08361 · 26 May 2016
Papers citing "No bad local minima: Data independent training error guarantees for multilayer neural networks" (48 of 48 papers shown)
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged · ODL · 08 Feb 2024
When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Z. Luo · 21 Oct 2022
Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu-Xiang Wang · MQ · 13 Jun 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022
Exponentially Many Local Minima in Quantum Neural Networks
Xuchen You, Xiaodi Wu · 06 Oct 2021
Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
Jizong Peng, Ping Wang, Christian Desrosiers, M. Pedersoli · SSL · 29 Jul 2021
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 19 Mar 2021
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · FedML · 17 Dec 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 12 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · MLT, AAML · 20 May 2020
A study of local optima for learning feature interactions using neural networks
Yangzi Guo, Adrian Barbu · 11 Feb 2020
Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea, Berfin Simsek, Bernd Illing, W. Gerstner · 05 Jul 2019
Robust and Resource Efficient Identification of Two Hidden Layer Neural Networks
M. Fornasier, T. Klock, Michael Rauchensteiner · 30 Jun 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu, Yuanzhi Li · 24 May 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak · NoLa · 27 Mar 2019
Understanding over-parameterized deep networks by geometrization
Xiao Dong, Ling Zhou · GNN, AI4CE · 11 Feb 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · MLT · 24 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu · 24 Jan 2019
Scaling description of generalization with number of parameters in deep learning
Mario Geiger, Arthur Jacot, S. Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, M. Wyart · 06 Jan 2019
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Henning Petzka, C. Sminchisescu · 16 Dec 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, J. Lee, Haochuan Li, Liwei Wang, M. Tomizuka · ODL · 09 Nov 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
Chulhee Yun, S. Sra, Ali Jadbabaie · 17 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei, J. Lee, Qiang Liu, Tengyu Ma · 12 Oct 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu · 04 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · MLT, ODL · 04 Oct 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu · MLT · 20 Jun 2018
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
Dong Yin, Yudong Chen, K. Ramchandran, Peter L. Bartlett · FedML · 14 Jun 2018
Adding One Neuron Can Eliminate All Bad Local Minima
Shiyu Liang, Ruoyu Sun, J. Lee, R. Srikant · 22 May 2018
End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition
Samet Oymak, Mahdi Soltanolkotabi · 16 May 2018
The Global Optimization Geometry of Shallow Linear Neural Networks
Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, M. Wakin · ODL · 13 May 2018
The Loss Surface of XOR Artificial Neural Networks
D. Mehta, Xiaojun Zhao, Edgar A. Bernal, D. Wales · 06 Apr 2018
Comparing Dynamics: Deep Neural Networks versus Glassy Systems
M. Baity-Jesi, Levent Sagun, Mario Geiger, S. Spigler, Gerard Ben Arous, C. Cammarota, Yann LeCun, M. Wyart, Giulio Biroli · AI4CE · 19 Mar 2018
Constrained Deep Learning using Conditional Gradient and Applications in Computer Vision
Sathya Ravi, Tuan Dinh, Vishnu Suresh Lokhande, Vikas Singh · AI4CE · 17 Mar 2018
Essentially No Barriers in Neural Network Energy Landscape
Felix Dräxler, K. Veschgini, M. Salmhofer, Fred Hamprecht · MoMe · 02 Mar 2018
Deep Neural Networks Learn Non-Smooth Functions Effectively
Masaaki Imaizumi, Kenji Fukumizu · 13 Feb 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer, Itay Hubara, Daniel Soudry · 14 Jan 2018
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir · 24 Dec 2017
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi, Adel Javanmard, J. Lee · 16 Jul 2017
Global optimality conditions for deep neural networks
Chulhee Yun, S. Sra, Ali Jadbabaie · 08 Jul 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon · MLT · 10 Jun 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari, Adam M. Oberman, Stanley Osher, Stefano Soatto, G. Carlier · 17 Apr 2017
Convergence Results for Neural Networks via Electrodynamics
Rina Panigrahy, Sushant Sachdeva, Qiuyi Zhang · MLT, MDE · 01 Feb 2017
An empirical analysis of the optimization of deep network loss surfaces
Daniel Jiwoong Im, Michael Tao, K. Branson · ODL · 13 Dec 2016
Identity Matters in Deep Learning
Moritz Hardt, Tengyu Ma · OOD · 14 Nov 2016
Topology and Geometry of Half-Rectified Network Optimization
C. Freeman, Joan Bruna · 04 Nov 2016
Piecewise convexity of artificial neural networks
Blaine Rister, Daniel L Rubin · AAML, ODL · 17 Jul 2016
The Loss Surfaces of Multilayer Networks
A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun · ODL · 30 Nov 2014
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton, Nitish Srivastava, A. Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov · VLM · 03 Jul 2012