Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv: 1810.02054 · 4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
MLT, ODL
Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks" (50 / 244 papers shown)
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time
Zhao-quan Song, Licheng Zhang, Ruizhe Zhang · 14 Dec 2021

SCORE: Approximating Curvature Information under Self-Concordant Regularization
Adeyemi Damilare Adeoye, Alberto Bemporad · 14 Dec 2021

Provable Continual Learning via Sketched Jacobian Approximations
Reinhard Heckel · CLL · 09 Dec 2021

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons
Fangshuo Liao, Anastasios Kyrillidis · 05 Dec 2021

Fast Graph Neural Tangent Kernel via Kronecker Sketching
Shunhua Jiang, Yunze Man, Zhao-quan Song, Zheng Yu, Danyang Zhuo · 04 Dec 2021

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao-quan Song, Atri Rudra, Christopher Ré · 30 Nov 2021

Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang, Sunil R. Gupta, A. Nguyen, Svetha Venkatesh · OffRL · 27 Nov 2021

Learning with convolution and pooling operations in kernel methods
Theodor Misiakiewicz, Song Mei · MLT · 16 Nov 2021

On the Equivalence between Neural Network and Support Vector Machine
Yilan Chen, Wei Huang, Lam M. Nguyen, Tsui-Wei Weng · AAML · 11 Nov 2021

Subquadratic Overparameterization for Shallow Neural Networks
Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, V. Cevher · 02 Nov 2021
Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher
Mehdi Rezagholizadeh, A. Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, A. Ghodsi · 16 Oct 2021

Provable Regret Bounds for Deep Online Learning and Control
Xinyi Chen, Edgar Minasyan, Jason D. Lee, Elad Hazan · 15 Oct 2021

AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion
Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang · 12 Oct 2021

Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · UQCV, MLT · 12 Oct 2021

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
Jiayao Zhang, Hua Wang, Weijie J. Su · 11 Oct 2021

Deep Bayesian inference for seismic imaging with tasks
Ali Siahkoohi, G. Rizzuti, Felix J. Herrmann · BDL, UQCV · 10 Oct 2021

Does Preprocessing Help Training Over-parameterized Neural Networks?
Zhao-quan Song, Shuo Yang, Ruizhe Zhang · 09 Oct 2021

New Insights into Graph Convolutional Networks using Neural Tangent Kernels
Mahalakshmi Sabanayagam, P. Esser, D. Ghoshdastidar · 08 Oct 2021

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright · MLT, AI4CE · 06 Oct 2021

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen · FedML · 06 Oct 2021
Theory of overparametrization in quantum neural networks
Martín Larocca, Nathan Ju, Diego García-Martín, Patrick J. Coles, M. Cerezo · 23 Sep 2021

NASI: Label- and Data-agnostic Neural Architecture Search at Initialization
Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi, K. H. Low · 02 Sep 2021

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu · MLT, AI4CE · 25 Aug 2021

Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang, Jason D. Lee, Zhaoran Wang, Zhuoran Yang · 30 Jul 2021

Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
Arnulf Jentzen, Adrian Riekert · 09 Jul 2021

A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf, Alon Brutzkus, Amir Globerson · 04 Jul 2021

Random Neural Networks in the Infinite Width Limit as Gaussian Processes
Boris Hanin · BDL · 04 Jul 2021

AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen, Houwen Peng, Jianlong Fu, Haibin Ling · ViT · 01 Jul 2021

Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
Alessandro Favero, Francesco Cagnetta, M. Wyart · 16 Jun 2021

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
A. Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gurbuzbalaban, Umut Şimşekli, Lingjiong Zhu · 09 Jun 2021
TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block Inversion
Saeed Soori, Bugra Can, Baourun Mu, Mert Gurbuzbalaban, M. Dehnavi · 07 Jun 2021

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training
Yatong Bai, Tanmay Gautam, Yujie Gai, Somayeh Sojoudi · AAML · 25 May 2021

Global Convergence of Three-layer Neural Networks in the Mean Field Regime
H. Pham, Phan-Minh Nguyen · MLT, AI4CE · 11 May 2021

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis
Baihe Huang, Xiaoxiao Li, Zhao-quan Song, Xin Yang · FedML · 11 May 2021

RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton · 01 May 2021

Generalization Guarantees for Neural Architecture Search with Train-Validation Split
Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi · AI4CE, OOD · 29 Apr 2021

PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols
Songlin Yang, Yanpeng Zhao, Kewei Tu · 28 Apr 2021

Understanding Overparameterization in Generative Adversarial Networks
Yogesh Balaji, M. Sajedi, N. Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, S. Feizi · AI4CE · 12 Apr 2021

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
Arnulf Jentzen, Adrian Riekert · MLT · 01 Apr 2021

The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation
Q. Du, Yiqi Gu, Haizhao Yang, Chao Zhou · 21 Mar 2021
Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy
Lucas Liebenwein, Cenk Baykal, Brandon Carter, David K. Gifford, Daniela Rus · AAML · 04 Mar 2021

Experiments with Rich Regime Training for Deep Learning
Xinyan Li, A. Banerjee · 26 Feb 2021

Learning with invariances in random features and kernel models
Song Mei, Theodor Misiakiewicz, Andrea Montanari · OOD · 25 Feb 2021

On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
Zhiyuan Li, Sadhika Malladi, Sanjeev Arora · 24 Feb 2021

Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases
Arnulf Jentzen, T. Kröger · ODL · 23 Feb 2021

GIST: Distributed Training for Large-Scale Graph Convolutional Networks
Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis · BDL, GNN, LRM · 20 Feb 2021

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek · 19 Feb 2021

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Shahar Azulay, E. Moroshko, Mor Shpigel Nacson, Blake E. Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry · AI4CE · 19 Feb 2021

Reproducing Activation Function for Deep Learning
Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang · 13 Jan 2021

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · FedML · 17 Dec 2020