ResearchTrend.AI

arXiv:1810.02054
Gradient Descent Provably Optimizes Over-parameterized Neural Networks

S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
MLT, ODL
4 October 2018

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 244 papers shown
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal, Param Budhraja, V. Raj, A. Hota
07 Dec 2022
Infinite-width limit of deep linear neural networks
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli
29 Nov 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman, Jiehao Liang, Zhao-quan Song, Ruizhe Zhang, Danyang Zhuo
25 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion
Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar
15 Nov 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang, Chen Dun, Fangshuo Liao, C. Jermaine, Anastasios Kyrillidis
28 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee
20 Oct 2022
Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks
Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo
18 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan
13 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis, Julia Kempe
AAML
11 Oct 2022
Efficient NTK using Dimensionality Reduction
Nir Ailon, Supratim Shit
10 Oct 2022
Dynamical Isometry for Residual Networks
Advait Gadhikar, R. Burkholz
ODL, AI4CE
05 Oct 2022
On the optimization and generalization of overparameterized implicit neural networks
Tianxiang Gao, Hongyang Gao
MLT, AI4CE
30 Sep 2022
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George, Guillaume Lajoie, A. Baratin
19 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile, G. Welper
ODL
17 Sep 2022
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher
15 Sep 2022
Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li
MLT, MoE
04 Aug 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li
26 Jul 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
18 Jul 2022
Efficient Augmentation for Imbalanced Deep Learning
Damien Dablain, C. Bellinger, Bartosz Krawczyk, Nitesh V. Chawla
13 Jul 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora
08 Jul 2022
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi
SSL, MLT
30 Jun 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao-quan Song, David P. Woodruff
26 Jun 2022
Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, M. Wyart
MLT
24 Jun 2022
Large-width asymptotics for ReLU neural networks with $α$-Stable initializations
Stefano Favaro, S. Fortini, Stefano Peluchetti
16 Jun 2022
Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu-Xiang Wang
MQ
13 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran
13 Jun 2022
Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli
08 Jun 2022
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee
08 Jun 2022
Non-convex online learning via algorithmic equivalence
Udaya Ghai, Zhou Lu, Elad Hazan
30 May 2022
Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin
27 May 2022
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu
24 May 2022
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin
GNN, AI4CE
24 May 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran, Gal Vardi, Jason D. Lee
MLT
18 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey
18 May 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
MLT
03 May 2022
Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs
Songlin Yang, Wei Liu, Kewei Tu
01 May 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying
24 Apr 2022
Theory of Graph Neural Networks: Representation and Learning
Stefanie Jegelka
GNN, AI4CE
16 Apr 2022
On Convergence Lemma and Convergence Stability for Piecewise Analytic Functions
Xiaotie Deng, Hanyu Li, Ningyuan Li
04 Apr 2022
Convergence of gradient descent for deep neural networks
S. Chatterjee
ODL
30 Mar 2022
On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
Elvis Dohmatob, A. Bietti
AAML
22 Mar 2022
The Spectral Bias of Polynomial Neural Networks
Moulik Choraria, L. Dadi, Grigorios G. Chrysos, Julien Mairal, V. Cevher
27 Feb 2022
Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity
Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, Ian J. Barnett
25 Feb 2022
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett
MLT
15 Feb 2022
Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
Tianyi Liu, Yan Li, Enlu Zhou, Tuo Zhao
07 Feb 2022
Demystify Optimization and Generalization of Over-parameterized PAC-Bayesian Learning
Wei Huang, Chunrui Liu, Yilan Chen, Tianyu Liu, R. Xu
BDL, MLT
04 Feb 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk, J. Cyranka
ODL
28 Jan 2022
How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
Shuai Zhang, M. Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong
SSL, MLT
21 Jan 2022
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
Benjamin Bowman, Guido Montúfar
12 Jan 2022
Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime
Rui Zhang, Shihua Zhang
TDI
15 Dec 2021