SGD Learns the Conjugate Kernel Class of the Network

Amit Daniely · 27 February 2017 · arXiv:1702.08503

Papers citing "SGD Learns the Conjugate Kernel Class of the Network" (30 of 130 papers shown)
• Decoupling Gating from Linearity · Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz · 12 Jun 2019
• Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks · M. Zhu, Xiao-Yang Liu, Xiaodong Wang · 12 Jun 2019
• Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound · Zhao Song, Xin Yang · 09 Jun 2019
• Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks · Yuan Cao, Quanquan Gu · 30 May 2019 · [MLT, AI4CE]
• What Can ResNet Learn Efficiently, Going Beyond Kernels? · Zeyuan Allen-Zhu, Yuanzhi Li · 24 May 2019
• Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima · Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang · 24 May 2019
• On Exact Computation with an Infinitely Wide Neural Net · Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang · 26 Apr 2019
• Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections · Weinan E, Chao Ma, Qingcan Wang, Lei Wu · 10 Apr 2019 · [MLT]
• A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics · Weinan E, Chao Ma, Lei Wu · 08 Apr 2019 · [MLT]
• On the Power and Limitations of Random Features for Understanding Neural Networks · Gilad Yehudai, Ohad Shamir · 01 Apr 2019 · [MLT]
• Theory III: Dynamics and Generalization in Deep Networks · Andrzej Banburski, Q. Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio · 12 Mar 2019 · [AI4CE]
• Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent · Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington · 18 Feb 2019
• Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks · Yuan Cao, Quanquan Gu · 04 Feb 2019 · [ODL, MLT, AI4CE]
• Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks · Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · 24 Jan 2019 · [MLT]
• Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers · Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang · 12 Nov 2018 · [MLT]
• A Convergence Theory for Deep Learning via Over-Parameterization · Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 09 Nov 2018 · [AI4CE, ODL]
• Gradient Descent Finds Global Minima of Deep Neural Networks · S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka · 09 Nov 2018 · [ODL]
• On the Convergence Rate of Training Recurrent Neural Networks · Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 29 Oct 2018
• A Priori Estimates of the Population Risk for Two-layer Neural Networks · Weinan E, Chao Ma, Lei Wu · 15 Oct 2018
• Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel · Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma · 12 Oct 2018
• Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem · Alon Brutzkus, Amir Globerson · 06 Oct 2018 · [MLT]
• Gradient Descent Provably Optimizes Over-parameterized Neural Networks · S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · 04 Oct 2018 · [MLT, ODL]
• Gradient Descent for One-Hidden-Layer Neural Networks: Polynomial Convergence and SQ Lower Bounds · Santosh Vempala, John Wilmes · 07 May 2018 · [MLT]
• A Provably Correct Algorithm for Deep Learning that Actually Works · Eran Malach, Shai Shalev-Shwartz · 26 Mar 2018 · [MLT]
• Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks · Peter L. Bartlett, D. Helmbold, Philip M. Long · 16 Feb 2018
• Size-Independent Sample Complexity of Neural Networks · Noah Golowich, Alexander Rakhlin, Ohad Shamir · 18 Dec 2017
• Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks · Surbhi Goel, Adam R. Klivans · 11 Aug 2017
• Weight Sharing is Crucial to Succesful Optimization · Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah · 02 Jun 2017
• Random Features for Compositional Kernels · Amit Daniely, Roy Frostig, Vineet Gupta, Y. Singer · 22 Mar 2017 · [CoGe]
• Convergence Results for Neural Networks via Electrodynamics · Rina Panigrahy, Sushant Sachdeva, Qiuyi Zhang · 01 Feb 2017 · [MLT, MDE]