1810.02054
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
Papers citing
"Gradient Descent Provably Optimizes Over-parameterized Neural Networks"
50 / 244 papers shown
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal
Param Budhraja
V. Raj
A. Hota
23
2
0
07 Dec 2022
Infinite-width limit of deep linear neural networks
Lénaïc Chizat
Maria Colombo
Xavier Fernández-Real
Alessio Figalli
31
14
0
29 Nov 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao-quan Song
Ruizhe Zhang
Danyang Zhuo
71
31
0
25 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion
Michael Murray
Hui Jin
Benjamin Bowman
Guido Montúfar
30
11
0
15 Nov 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
18
3
0
28 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani
Anirbit Mukherjee
18
5
0
20 Oct 2022
Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks
Louis Schatzki
Martín Larocca
Quynh T. Nguyen
F. Sauvage
M. Cerezo
27
84
0
18 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
21
4
0
13 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis
Julia Kempe
AAML
39
16
0
11 Oct 2022
Efficient NTK using Dimensionality Reduction
Nir Ailon
Supratim Shit
26
0
0
10 Oct 2022
Dynamical Isometry for Residual Networks
Advait Gadhikar
R. Burkholz
ODL
AI4CE
32
2
0
05 Oct 2022
On the optimization and generalization of overparameterized implicit neural networks
Tianxiang Gao
Hongyang Gao
MLT
AI4CE
19
3
0
30 Sep 2022
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George
Guillaume Lajoie
A. Baratin
23
5
0
19 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in 1/d
R. Gentile
G. Welper
ODL
44
6
0
17 Sep 2022
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
V. Cevher
39
19
0
15 Sep 2022
Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen
Yihe Deng
Yue-bo Wu
Quanquan Gu
Yuan-Fang Li
MLT
MoE
27
53
0
04 Aug 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
19
42
0
26 Jul 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
25
123
0
18 Jul 2022
Efficient Augmentation for Imbalanced Deep Learning
Damien Dablain
C. Bellinger
Bartosz Krawczyk
Nitesh V. Chawla
22
7
0
13 Jul 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
32
27
0
08 Jul 2022
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian
Jason D. Lee
Mahdi Soltanolkotabi
SSL
MLT
17
112
0
30 Jun 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu
Simon Omlor
Zhao-quan Song
David P. Woodruff
25
15
0
26 Jun 2022
Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini
Francesco Cagnetta
Eric Vanden-Eijnden
M. Wyart
MLT
29
23
0
24 Jun 2022
Large-width asymptotics for ReLU neural networks with α-Stable initializations
Stefano Favaro
S. Fortini
Stefano Peluchetti
20
2
0
16 Jun 2022
Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang
Ming Yin
Yu-Xiang Wang
MQ
16
4
0
13 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen
Trang H. Tran
32
2
0
13 Jun 2022
Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli
21
71
0
08 Jun 2022
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani
Yunzhi Bai
Jason D. Lee
21
10
0
08 Jun 2022
Non-convex online learning via algorithmic equivalence
Udaya Ghai
Zhou Lu
Elad Hazan
14
8
0
30 May 2022
Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling
Xingyu Xie
Qiuhao Wang
Zongpeng Zhang
Zhouchen Lin
27
12
0
27 May 2022
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou
Qixuan Zhou
Zhenyuan Jin
Tao Luo
Yaoyu Zhang
Zhi-Qin John Xu
22
20
0
24 May 2022
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu
Chaoyue Liu
M. Belkin
GNN
AI4CE
15
4
0
24 May 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran
Gal Vardi
Jason D. Lee
MLT
51
23
0
18 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng
Sameera Ramasinghe
Xueqian Li
Simon Lucey
23
18
0
18 May 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba
Murat A. Erdogdu
Taiji Suzuki
Zhichao Wang
Denny Wu
Greg Yang
MLT
29
121
0
03 May 2022
Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs
Songlin Yang
Wei Liu
Kewei Tu
13
8
0
01 May 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma
D. Kunin
Lei Wu
Lexing Ying
25
27
0
24 Apr 2022
Theory of Graph Neural Networks: Representation and Learning
Stefanie Jegelka
GNN
AI4CE
33
68
0
16 Apr 2022
On Convergence Lemma and Convergence Stability for Piecewise Analytic Functions
Xiaotie Deng
Hanyu Li
Ningyuan Li
8
0
0
04 Apr 2022
Convergence of gradient descent for deep neural networks
S. Chatterjee
ODL
19
20
0
30 Mar 2022
On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
Elvis Dohmatob
A. Bietti
AAML
21
13
0
22 Mar 2022
The Spectral Bias of Polynomial Neural Networks
Moulik Choraria
L. Dadi
Grigorios G. Chrysos
Julien Mairal
V. Cevher
22
18
0
27 Feb 2022
Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity
Shiyun Xu
Zhiqi Bu
Pratik Chaudhari
Ian J. Barnett
19
21
0
25 Feb 2022
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei
Niladri S. Chatterji
Peter L. Bartlett
MLT
30
29
0
15 Feb 2022
Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
Tianyi Liu
Yan Li
Enlu Zhou
Tuo Zhao
38
1
0
07 Feb 2022
Demystify Optimization and Generalization of Over-parameterized PAC-Bayesian Learning
Wei Huang
Chunrui Liu
Yilan Chen
Tianyu Liu
R. Xu
BDL
MLT
19
2
0
04 Feb 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk
J. Cyranka
ODL
28
3
0
28 Jan 2022
How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
Shuai Zhang
M. Wang
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
SSL
MLT
39
22
0
21 Jan 2022
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
Benjamin Bowman
Guido Montúfar
18
11
0
12 Jan 2022
Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime
Rui Zhang
Shihua Zhang
TDI
19
21
0
15 Dec 2021