SGD Learns the Conjugate Kernel Class of the Network
Amit Daniely · 27 February 2017 · arXiv:1702.08503
Papers citing "SGD Learns the Conjugate Kernel Class of the Network" (showing 30 of 130)
Decoupling Gating from Linearity. Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz. 12 Jun 2019.

Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks. M. Zhu, Xiao-Yang Liu, Xiaodong Wang. 12 Jun 2019.

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound. Zhao Song, Xin Yang. 09 Jun 2019.

Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks. Yuan Cao, Quanquan Gu. 30 May 2019. [MLT, AI4CE]

What Can ResNet Learn Efficiently, Going Beyond Kernels? Zeyuan Allen-Zhu, Yuanzhi Li. 24 May 2019.

Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima. Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang. 24 May 2019.

On Exact Computation with an Infinitely Wide Neural Net. Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang. 26 Apr 2019.

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections. Weinan E, Chao Ma, Qingcan Wang, Lei Wu. 10 Apr 2019. [MLT]

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics. Weinan E, Chao Ma, Lei Wu. 08 Apr 2019. [MLT]

On the Power and Limitations of Random Features for Understanding Neural Networks. Gilad Yehudai, Ohad Shamir. 01 Apr 2019. [MLT]

Theory III: Dynamics and Generalization in Deep Networks. Andrzej Banburski, Q. Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio. 12 Mar 2019. [AI4CE]

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent. Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington. 18 Feb 2019.

Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks. Yuan Cao, Quanquan Gu. 04 Feb 2019. [ODL, MLT, AI4CE]

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks. Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang. 24 Jan 2019. [MLT]

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers. Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang. 12 Nov 2018. [MLT]

A Convergence Theory for Deep Learning via Over-Parameterization. Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. 09 Nov 2018. [AI4CE, ODL]

Gradient Descent Finds Global Minima of Deep Neural Networks. S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka. 09 Nov 2018. [ODL]

On the Convergence Rate of Training Recurrent Neural Networks. Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. 29 Oct 2018.

A Priori Estimates of the Population Risk for Two-layer Neural Networks. Weinan E, Chao Ma, Lei Wu. 15 Oct 2018.

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel. Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma. 12 Oct 2018.

Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem. Alon Brutzkus, Amir Globerson. 06 Oct 2018. [MLT]

Gradient Descent Provably Optimizes Over-parameterized Neural Networks. S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh. 04 Oct 2018. [MLT, ODL]

Gradient Descent for One-Hidden-Layer Neural Networks: Polynomial Convergence and SQ Lower Bounds. Santosh Vempala, John Wilmes. 07 May 2018. [MLT]

A Provably Correct Algorithm for Deep Learning that Actually Works. Eran Malach, Shai Shalev-Shwartz. 26 Mar 2018. [MLT]

Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks. Peter L. Bartlett, D. Helmbold, Philip M. Long. 16 Feb 2018.

Size-Independent Sample Complexity of Neural Networks. Noah Golowich, Alexander Rakhlin, Ohad Shamir. 18 Dec 2017.

Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks. Surbhi Goel, Adam R. Klivans. 11 Aug 2017.

Weight Sharing is Crucial to Succesful Optimization. Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah. 02 Jun 2017.

Random Features for Compositional Kernels. Amit Daniely, Roy Frostig, Vineet Gupta, Y. Singer. 22 Mar 2017. [CoGe]

Convergence Results for Neural Networks via Electrodynamics. Rina Panigrahy, Sushant Sachdeva, Qiuyi Zhang. 01 Feb 2017. [MLT, MDE]