arXiv:1712.08968
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir
24 December 2017

Cited By
Papers citing "Spurious Local Minima are Common in Two-Layer ReLU Neural Networks" (50 of 50 papers shown):
1. Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
   Ziang Chen, Rong Ge (10 Jan 2025) [MLT]

2. Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
   Berfin Simsek, Amire Bendjeddou, Daniel Hsu (13 Nov 2024)

3. How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
   Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen (12 Mar 2024)

4. How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
   Nuoya Xiong, Lijun Ding, Simon S. Du (03 Oct 2023)

5. Worrisome Properties of Neural Network Controllers and Their Symbolic Representations
   J. Cyranka, Kevin E. M. Church, J. Lessard (28 Jul 2023)

6. NTK-SAP: Improving neural network pruning by aligning training dynamics
   Yite Wang, Dawei Li, Ruoyu Sun (06 Apr 2023)

7. Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
   Weihang Xu, S. Du (20 Feb 2023)

8. An SDE for Modeling SAM: Theory and Insights
   Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurélien Lucchi (19 Jan 2023)

9. Regression as Classification: Influence of Task Formulation on Neural Network Features
   Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert (10 Nov 2022)

10. When Expressivity Meets Trainability: Fewer than n Neurons Can Work
    Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Z. Luo (21 Oct 2022)

11. Annihilation of Spurious Minima in Two-Layer ReLU Networks
    Yossi Arjevani, M. Field (12 Oct 2022)

12. Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
    Jianhao Ma, Li-Zhen Guo, S. Fattahi (01 Oct 2022)

13. On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
    Itay Safran, Gal Vardi, Jason D. Lee (18 May 2022) [MLT]

14. Self-scalable Tanh (Stan): Faster Convergence and Better Generalization in Physics-informed Neural Networks
    Raghav Gnanasambandam, Bo Shen, Jihoon Chung, Xubo Yue, Zhenyu Kong (26 Apr 2022) [LRM]

15. Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
    Bartlomiej Polaczyk, J. Cyranka (28 Jan 2022) [ODL]

16. How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
    Shuai Zhang, M. Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong (21 Jan 2022) [SSL, MLT]

17. Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
    Devansh Bisla, Jing Wang, A. Choromańska (20 Jan 2022)

18. Mode connectivity in the loss landscape of parameterized quantum circuits
    Kathleen E. Hamilton, E. Lynn, R. Pooser (09 Nov 2021)

19. Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
    Tolga Ergen, Mert Pilanci (18 Oct 2021)

20. Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
    Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong (12 Oct 2021) [UQCV, MLT]

21. Exponentially Many Local Minima in Quantum Neural Networks
    Xuchen You, Xiaodi Wu (06 Oct 2021)

22. Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
    Yossi Arjevani, M. Field (21 Jul 2021)

23. Equivariant bifurcation, quadratic equivariants, and symmetry breaking for the standard representation of S_n
    Yossi Arjevani, M. Field (06 Jul 2021)

24. Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
    Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek (19 Mar 2021)

25. Understanding self-supervised Learning Dynamics without Contrastive Pairs
    Yuandong Tian, Xinlei Chen, Surya Ganguli (12 Feb 2021) [SSL]

26. The Nonconvex Geometry of Linear Inverse Problems
    Armin Eftekhari, Peyman Mohajerin Esfahani (07 Jan 2021)

27. Learning Graph Neural Networks with Approximate Gradient Descent
    Qunwei Li, Shaofeng Zou, Leon Wenliang Zhong (07 Dec 2020) [GNN]

28. PAC Confidence Predictions for Deep Neural Network Classifiers
    Sangdon Park, Shuo Li, Insup Lee, Osbert Bastani (02 Nov 2020) [UQCV]

29. Non-convergence of stochastic gradient descent in the training of deep neural networks
    Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek (12 Jun 2020)

30. Symmetry & critical points for a model shallow neural network
    Yossi Arjevani, M. Field (23 Mar 2020)

31. Growing axons: greedy learning of neural networks with application to function approximation
    Daria Fokina, Ivan V. Oseledets (28 Oct 2019)

32. Neural ODEs as the Deep Limit of ResNets with constant weights
    B. Avelin, K. Nystrom (28 Jun 2019) [ODL]

33. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
    Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang (24 Jan 2019) [MLT]

34. Width Provably Matters in Optimization for Deep Linear Neural Networks
    S. Du, Wei Hu (24 Jan 2019)

35. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
    Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu (21 Nov 2018) [ODL]

36. Gradient Descent Finds Global Minima of Deep Neural Networks
    S. Du, J. Lee, Haochuan Li, Liwei Wang, M. Tomizuka (09 Nov 2018) [ODL]

37. A Closer Look at Deep Policy Gradients
    Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry (06 Nov 2018)

38. Benefits of over-parameterization with EM
    Ji Xu, Daniel J. Hsu, A. Maleki (26 Oct 2018)

39. Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
    Chulhee Yun, S. Sra, Ali Jadbabaie (17 Oct 2018)

40. Learning Two-layer Neural Networks with Symmetric Inputs
    Rong Ge, Rohith Kuditipudi, Zhize Li, Xiang Wang (16 Oct 2018) [OOD, MLT]

41. A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
    Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu (04 Oct 2018)

42. Gradient Descent Provably Optimizes Over-parameterized Neural Networks
    S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh (04 Oct 2018) [MLT, ODL]

43. Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization
    G. Wang, G. Giannakis, Jie Chen (14 Aug 2018) [MLT]

44. Model Reconstruction from Model Explanations
    S. Milli, Ludwig Schmidt, Anca Dragan, Moritz Hardt (13 Jul 2018) [FAtt]

45. Learning One-hidden-layer ReLU Networks via Gradient Descent
    Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu (20 Jun 2018) [MLT]

46. The committee machine: Computational to statistical gaps in learning a two-layers neural network
    Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, N. Macris, Lenka Zdeborová (14 Jun 2018)

47. How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?
    S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh (21 May 2018) [SSL]

48. Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
    S. Du, Surbhi Goel (20 May 2018) [MLT]

49. The Global Optimization Geometry of Shallow Linear Neural Networks
    Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, M. Wakin (13 May 2018) [ODL]

50. Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
    Mahdi Soltanolkotabi, Adel Javanmard, J. Lee (16 Jul 2017)