arXiv:1712.08968
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
International Conference on Machine Learning (ICML), 2018
24 December 2017
Itay Safran
Ohad Shamir
Papers citing "Spurious Local Minima are Common in Two-Layer ReLU Neural Networks" (showing 50 of 183)
Block Coordinate Descent for Neural Networks Provably Finds Global Minima
Shunta Akiyama
26 Oct 2025
LLM Priors for ERM over Programs
Shivam Singhal
Eran Malach
T. Poggio
Tomer Galanti
16 Oct 2025
Deep Learning-based Lightweight RGB Object Tracking for Augmented Reality Devices
Alice Smith
Bob Johnson
Xiaoyu Zhu
Carol Lee
04 Oct 2025
Generative quantum advantage for classical and quantum problems
Hsin-Yuan Huang
Michael Broughton
Norhan Eassa
Hartmut Neven
Ryan Babbush
Jarrod R. McClean
10 Sep 2025
Flat Channels to Infinity in Neural Loss Landscapes
Flavio Martinelli
Alexander Van Meegen
Berfin Simsek
W. Gerstner
Johanni Brea
17 Jun 2025
Benignity of loss landscape with weight decay requires both large overparametrization and initialization
Etienne Boursier
Matthew Bowditch
Matthias Englert
R. Lazic
28 May 2025
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Neural Information Processing Systems (NeurIPS), 2024
Ziang Chen
Rong Ge
10 Jan 2025
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Berfin Simsek
Amire Bendjeddou
Daniel Hsu
13 Nov 2024
The Persistence of Neural Collapse Despite Low-Rank Bias
Connall Garrod
Jonathan P. Keating
30 Oct 2024
Loss Landscape Characterization of Neural Networks without Over-Parametrization
Neural Information Processing Systems (NeurIPS), 2024
Rustem Islamov
Niccolò Ajroldi
Antonio Orvieto
Aurelien Lucchi
16 Oct 2024
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Shuai Zhang
Heshan Devaka Fernando
Miao Liu
K. Murugesan
Songtao Lu
Pin-Yu Chen
Tianyi Chen
Meng Wang
24 May 2024
Solution space and storage capacity of fully connected two-layer neural networks with generic activation functions
Sota Nishiyama
Masayuki Ohzeki
20 Apr 2024
Exploring Neural Network Landscapes: Star-Shaped and Geodesic Connectivity
Zhanran Lin
Puheng Li
Lei Wu
09 Apr 2024
The Real Tropical Geometry of Neural Networks
Marie-Charlotte Brandenburg
Georg Loho
Guido Montúfar
18 Mar 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2024
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
12 Mar 2024
Improving Model Fusion by Training-time Neuron Alignment with Fixed Neuron Anchors
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Zexi Li
Zhiqi Li
Jie Lin
Zhenyuan Zhang
Tao Lin
Chao Wu
02 Feb 2024
RedEx: Beyond Fixed Representation Methods via Convex Optimization
International Conference on Algorithmic Learning Theory (ALT), 2024
Amit Daniely
Mariano Schain
Gilad Yehudai
15 Jan 2024
A topological description of loss surfaces based on Betti Numbers
Neural Networks (NN), 2024
Maria Sofia Bucarelli
Giuseppe Alessio D’Inverno
Monica Bianchini
F. Scarselli
Fabrizio Silvestri
08 Jan 2024
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias
Neural Information Processing Systems (NeurIPS), 2023
Jiyoung Park
Ian Pelakh
Stephan Wojtowytsch
10 Nov 2023
On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Simon Martin
Francis Bach
Giulio Biroli
07 Nov 2023
Should Under-parameterized Student Networks Copy or Average Teacher Weights?
Neural Information Processing Systems (NeurIPS), 2023
Berfin Simsek
Amire Bendjeddou
W. Gerstner
Johanni Brea
03 Nov 2023
A qualitative difference between gradient flows of convex functions in finite- and infinite-dimensional Hilbert spaces
Jonathan W. Siegel
Stephan Wojtowytsch
26 Oct 2023
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration
Neural Information Processing Systems (NeurIPS), 2023
Shuai Zhang
Hongkang Li
Meng Wang
Miao Liu
Pin-Yu Chen
Songtao Lu
Sijia Liu
K. Murugesan
Subhajit Chaudhury
24 Oct 2023
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
International Conference on Learning Representations (ICLR), 2023
Nuoya Xiong
Lijun Ding
Simon S. Du
03 Oct 2023
Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs
Neural Information Processing Systems (NeurIPS), 2023
Rajat Vadiraj Dwaraknath
Tolga Ergen
Mert Pilanci
26 Sep 2023
Worrisome Properties of Neural Network Controllers and Their Symbolic Representations
European Conference on Artificial Intelligence (ECAI), 2023
J. Cyranka
Kevin E. M. Church
J. Lessard
28 Jul 2023
Empirical Loss Landscape Analysis of Neural Network Activation Functions
Anna Sergeevna Bosman
A. Engelbrecht
Mardé Helbig
28 Jun 2023
Black holes and the loss landscape in machine learning
Journal of High Energy Physics (JHEP), 2023
P. Kumar
Taniya Mandal
Swapnamay Mondal
26 Jun 2023
Gradient is All You Need? How Consensus-Based Optimization can be Interpreted as a Stochastic Relaxation of Gradient Descent
Konstantin Riedl
T. Klock
Carina Geldhauser
M. Fornasier
16 Jun 2023
Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape
Kedar Karhadkar
Michael Murray
Hanna Tseran
Guido Montúfar
31 May 2023
Expand-and-Cluster: Parameter Recovery of Neural Networks
International Conference on Machine Learning (ICML), 2023
Flavio Martinelli
Berfin Simsek
W. Gerstner
Johanni Brea
25 Apr 2023
NTK-SAP: Improving neural network pruning by aligning training dynamics
International Conference on Learning Representations (ICLR), 2023
Yite Wang
Dawei Li
Tian Ding
06 Apr 2023
On the existence of optimal shallow feedforward networks with ReLU activation
Steffen Dereich
Sebastian Kassing
06 Mar 2023
On the existence of minimizers in shallow residual ReLU neural network optimization landscapes
SIAM Journal on Numerical Analysis (SINUM), 2023
Steffen Dereich
Arnulf Jentzen
Sebastian Kassing
28 Feb 2023
Random Teachers are Good Teachers
International Conference on Machine Learning (ICML), 2023
Felix Sarnthein
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
23 Feb 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Annual Conference Computational Learning Theory (COLT), 2023
Weihang Xu
S. Du
20 Feb 2023
Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
International Conference on Learning Representations (ICLR), 2023
Shuai Zhang
Ming Wang
Pin-Yu Chen
Sijia Liu
Songtao Lu
Miaoyuan Liu
06 Feb 2023
An SDE for Modeling SAM: Theory and Insights
International Conference on Machine Learning (ICML), 2023
Enea Monzio Compagnoni
Luca Biggio
Antonio Orvieto
F. Proske
Hans Kersting
Aurelien Lucchi
19 Jan 2023
Linear RNNs Provably Learn Linear Dynamic Systems
Lifu Wang
Tianyu Wang
Shengwei Yi
Bo Shen
Bo Hu
Xing Cao
19 Nov 2022
Regression as Classification: Influence of Task Formulation on Neural Network Features
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Lawrence Stewart
Francis R. Bach
Quentin Berthet
Jean-Philippe Vert
10 Nov 2022
Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier
T. Klock
Marco Mondelli
Michael Rauchensteiner
08 Nov 2022
When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Neural Information Processing Systems (NeurIPS), 2022
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Tian Ding
Jianfeng Yao
21 Oct 2022
Dissipative residual layers for unsupervised implicit parameterization of data manifolds
Viktor Reshniak
13 Oct 2022
Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence
Diyuan Wu
Vyacheslav Kungurtsev
Marco Mondelli
13 Oct 2022
Annihilation of Spurious Minima in Two-Layer ReLU Networks
Neural Information Processing Systems (NeurIPS), 2022
Yossi Arjevani
M. Field
12 Oct 2022
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
International Conference on Learning Representations (ICLR), 2022
Jianhao Ma
Li-Zhen Guo
Salar Fattahi
01 Oct 2022
Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Journal of Optimization Theory and Applications (JOTA), 2022
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
03 Aug 2022
Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data
Annual Conference on Information Sciences and Systems (CISS), 2022
Hongkang Li
Shuai Zhang
Ming Wang
07 Jul 2022
Special Properties of Gradient Descent with Large Learning Rates
International Conference on Machine Learning (ICML), 2022
Amirkeivan Mohtashami
Martin Jaggi
Sebastian U. Stich
30 May 2022
Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
International Conference on Learning Representations (ICLR), 2022
Shunta Akiyama
Taiji Suzuki
30 May 2022