Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1811.03962
Cited By
A Convergence Theory for Deep Learning via Over-Parameterization
9 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
AI4CE
ODL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Convergence Theory for Deep Learning via Over-Parameterization"
50 / 370 papers shown
Title
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
47
48
0
24 Jan 2021
Reproducing Activation Function for Deep Learning
Senwei Liang
Liyao Lyu
Chunmei Wang
Haizhao Yang
36
21
0
13 Jan 2021
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy
Yi Tian Xu
Y. Aflalo
Lihi Zelnik-Manor
Rong Jin
41
3
0
12 Jan 2021
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Spencer Frei
Yuan Cao
Quanquan Gu
FedML
MLT
70
19
0
04 Jan 2021
Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training
Theodoros Tsiligkaridis
Jay Roberts
AAML
22
11
0
22 Dec 2020
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen
Marco Mondelli
Guido Montúfar
25
81
0
21 Dec 2020
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
FedML
60
356
0
17 Dec 2020
Estimation of the Mean Function of Functional Data via Deep Neural Networks
Shuoyang Wang
Guanqun Cao
Zuofeng Shang
35
20
0
08 Dec 2020
Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki
Sekouba Kaba
Yoshua Bengio
Aaron Courville
Doina Precup
Guillaume Lajoie
MLT
50
258
0
18 Nov 2020
Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Zeke Xie
Fengxiang He
Shaopeng Fu
Issei Sato
Dacheng Tao
Masashi Sugiyama
21
60
0
12 Nov 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
39
18
0
09 Nov 2020
Are wider nets better given the same number of parameters?
A. Golubeva
Behnam Neyshabur
Guy Gur-Ari
27
44
0
27 Oct 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
33
17
0
25 Oct 2020
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi
Jianfeng Lu
13
15
0
22 Oct 2020
Deep Learning is Singular, and That's Good
Daniel Murfet
Susan Wei
Biwei Huang
Hui Li
Jesse Gell-Redman
T. Quella
UQCV
24
26
0
22 Oct 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun
Shankar Krishnan
H. Mobahi
MLT
18
80
0
06 Oct 2020
On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu
Libin Zhu
M. Belkin
21
140
0
02 Oct 2020
Neural Thompson Sampling
Weitong Zhang
Dongruo Zhou
Lihong Li
Quanquan Gu
34
115
0
02 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti
Francis R. Bach
30
86
0
30 Sep 2020
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu
Mozhi Zhang
Jingling Li
S. Du
Ken-ichi Kawarabayashi
Stefanie Jegelka
MLT
25
306
0
24 Sep 2020
Tensor Programs III: Neural Matrix Laws
Greg Yang
14
44
0
22 Sep 2020
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen
Sheng Xu
32
93
0
22 Sep 2020
Generalized Leverage Score Sampling for Neural Networks
J. Lee
Ruoqi Shen
Zhao Song
Mengdi Wang
Zheng Yu
21
42
0
21 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai
Shengjie Luo
Keyulu Xu
Di He
Tie-Yan Liu
Liwei Wang
GNN
32
159
0
07 Sep 2020
Predicting Training Time Without Training
L. Zancato
Alessandro Achille
Avinash Ravichandran
Rahul Bhotika
Stefano Soatto
26
24
0
28 Aug 2020
Deep Networks and the Multiple Manifold Problem
Sam Buchanan
D. Gilboa
John N. Wright
166
39
0
25 Aug 2020
Multiple Descent: Design Your Own Generalization Curve
Lin Chen
Yifei Min
M. Belkin
Amin Karbasi
DRL
33
61
0
03 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
21
42
0
02 Aug 2020
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training
Andrea Montanari
Yiqiao Zhong
49
95
0
25 Jul 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko
Suriya Gunasekar
Blake E. Woodworth
J. Lee
Nathan Srebro
Daniel Soudry
35
85
0
13 Jul 2020
Weak error analysis for stochastic gradient descent optimization algorithms
A. Bercher
Lukas Gonon
Arnulf Jentzen
Diyora Salimova
28
4
0
03 Jul 2020
On the Similarity between the Laplace and Neural Tangent Kernels
Amnon Geifman
A. Yadav
Yoni Kasten
Meirav Galun
David Jacobs
Ronen Basri
23
89
0
03 Jul 2020
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
Luofeng Liao
You-Lin Chen
Zhuoran Yang
Bo Dai
Zhaoran Wang
Mladen Kolar
30
33
0
02 Jul 2020
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
Yibo Jiang
Cengiz Pehlevan
19
13
0
30 Jun 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
58
135
0
25 Jun 2020
Logarithmic Pruning is All You Need
Laurent Orseau
Marcus Hutter
Omar Rivasplata
28
88
0
22 Jun 2020
DO-Conv: Depthwise Over-parameterized Convolutional Layer
Jinming Cao
Yangyan Li
Mingchao Sun
Ying-Cong Chen
Dani Lischinski
Daniel Cohen-Or
Baoquan Chen
Changhe Tu
OOD
33
166
0
22 Jun 2020
Training (Overparametrized) Neural Networks in Near-Linear Time
Jan van den Brand
Binghui Peng
Zhao Song
Omri Weinstein
ODL
29
82
0
20 Jun 2020
Exploring Weight Importance and Hessian Bias in Model Pruning
Mingchen Li
Yahya Sattar
Christos Thrampoulidis
Samet Oymak
28
3
0
19 Jun 2020
An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives
Qi Qi
Zhishuai Guo
Yi Tian Xu
Rong Jin
Tianbao Yang
33
44
0
17 Jun 2020
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting
Giorgos Bouritsas
Fabrizio Frasca
S. Zafeiriou
M. Bronstein
58
424
0
16 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
14
37
0
12 Jun 2020
Directional convergence and alignment in deep learning
Ziwei Ji
Matus Telgarsky
20
164
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
141
11
0
08 Jun 2020
Is deeper better? It depends on locality of relevant features
Takashi Mori
Masahito Ueda
OOD
25
4
0
26 May 2020
Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks
Z. Fan
Zhichao Wang
44
71
0
25 May 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
AAML
39
147
0
20 May 2020
Orthogonal Over-Parameterized Training
Weiyang Liu
Rongmei Lin
Zhen Liu
James M. Rehg
Liam Paull
Li Xiong
Le Song
Adrian Weller
32
41
0
09 Apr 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth
Yiping Lu
Chao Ma
Yulong Lu
Jianfeng Lu
Lexing Ying
MLT
39
78
0
11 Mar 2020
Frequency Bias in Neural Networks for Input of Non-Uniform Density
Ronen Basri
Meirav Galun
Amnon Geifman
David Jacobs
Yoni Kasten
S. Kritchman
42
183
0
10 Mar 2020
Previous
1
2
3
4
5
6
7
8
Next