Can SGD Learn Recurrent Neural Networks with Provable Generalization?
4 February 2019 · arXiv:1902.01028
Zeyuan Allen-Zhu, Yuanzhi Li
MLT, LRM

Papers citing "Can SGD Learn Recurrent Neural Networks with Provable Generalization?"

26 papers shown

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas
VP, VLM · 335 · 2 · 0 · 02 Feb 2025

Liquid Structural State-Space Models
Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus
AI4TS · 155 · 107 · 0 · 26 Sep 2022

On the Provable Generalization of Recurrent Neural Networks
Lifu Wang, Bo Shen, Bo Hu, Xing Cao
144 · 8 · 0 · 29 Sep 2021

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
Satyen Kale, Ayush Sekhari, Karthik Sridharan
259 · 29 · 0 · 11 Jul 2021

Characterization of Generalizability of Spike Timing Dependent Plasticity trained Spiking Neural Networks
Biswadeep Chakraborty, Saibal Mukhopadhyay
125 · 15 · 0 · 31 May 2021

Recent advances in deep learning theory
Fengxiang He, Dacheng Tao
AI4CE · 130 · 51 · 0 · 20 Dec 2020

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li
FedML · 187 · 376 · 0 · 17 Dec 2020

Recent Theoretical Advances in Non-Convex Optimization
Marina Danilova, Pavel Dvurechensky, Alexander Gasnikov, Eduard A. Gorbunov, Sergey Guminov, Dmitry Kamzolov, Innokentiy Shibaev
129 · 79 · 0 · 11 Dec 2020

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang
MLT · 95 · 27 · 0 · 09 Jul 2020

Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli, Ozan Sener, George Deligiannidis, Murat A. Erdogdu
86 · 56 · 0 · 16 Jun 2020

Learning Long-Term Dependencies in Irregularly-Sampled Time Series
Mathias Lechner, Ramin Hasani
AI4TS · 71 · 132 · 0 · 08 Jun 2020

Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li
MLT, AAML · 122 · 151 · 0 · 20 May 2020

Disentangling Adaptive Gradient Methods from Learning Rates
Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
109 · 38 · 0 · 26 Feb 2020

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective
Kaixuan Huang, Yuqing Wang, Molei Tao, T. Zhao
MLT · 62 · 98 · 0 · 14 Feb 2020

Generalization and Representational Limits of Graph Neural Networks
Vikas Garg, Stefanie Jegelka, Tommi Jaakkola
GNN · 108 · 314 · 0 · 14 Feb 2020

Constructing Gradient Controllable Recurrent Neural Networks Using Hamiltonian Dynamics
Konstantin Rusch, J. Pearson, K. Zygalakis
34 · 0 · 0 · 11 Nov 2019

Machine Learning for Prediction with Missing Dynamics
J. Harlim, Shixiao W. Jiang, Senwei Liang, Haizhao Yang
AI4CE · 71 · 61 · 0 · 13 Oct 2019

Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Ziwei Ji, Matus Telgarsky
98 · 178 · 0 · 26 Sep 2019

Convex Programming for Estimation in Nonlinear Recurrent Models
S. Bahmani, Justin Romberg
57 · 10 · 0 · 26 Aug 2019

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li, Colin Wei, Tengyu Ma
90 · 300 · 0 · 10 Jul 2019

Learning in Gated Neural Networks
Ashok Vardhan Makkuva, Sewoong Oh, Sreeram Kannan, Pramod Viswanath
50 · 11 · 0 · 06 Jun 2019

What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu, Yuanzhi Li
416 · 183 · 0 · 24 May 2019

A Selective Overview of Deep Learning
Jianqing Fan, Cong Ma, Yiqiao Zhong
BDL, VLM · 206 · 135 · 0 · 10 Apr 2019

On the Power and Limitations of Random Features for Understanding Neural Networks
Gilad Yehudai, Ohad Shamir
MLT · 125 · 182 · 0 · 01 Apr 2019

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang
MLT · 235 · 775 · 0 · 12 Nov 2018

On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
253 · 193 · 0 · 29 Oct 2018