Can SGD Learn Recurrent Neural Networks with Provable Generalization?
4 February 2019 · arXiv:1902.01028
Zeyuan Allen-Zhu, Yuanzhi Li
MLT, LRM

Papers citing "Can SGD Learn Recurrent Neural Networks with Provable Generalization?"

26 papers shown

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas
VP, VLM · 335 · 2 · 0 · 02 Feb 2025

Liquid Structural State-Space Models
Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus
AI4TS · 155 · 107 · 0 · 26 Sep 2022

On the Provable Generalization of Recurrent Neural Networks
Lifu Wang, Bo Shen, Bo Hu, Xing Cao
144 · 8 · 0 · 29 Sep 2021

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
Satyen Kale, Ayush Sekhari, Karthik Sridharan
259 · 29 · 0 · 11 Jul 2021

Characterization of Generalizability of Spike Timing Dependent Plasticity trained Spiking Neural Networks
Biswadeep Chakraborty, Saibal Mukhopadhyay
125 · 15 · 0 · 31 May 2021

Recent advances in deep learning theory
Fengxiang He, Dacheng Tao
AI4CE · 130 · 51 · 0 · 20 Dec 2020

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li
FedML · 187 · 376 · 0 · 17 Dec 2020

Recent Theoretical Advances in Non-Convex Optimization
Marina Danilova, Pavel Dvurechensky, Alexander Gasnikov, Eduard A. Gorbunov, Sergey Guminov, Dmitry Kamzolov, Innokentiy Shibaev
129 · 79 · 0 · 11 Dec 2020

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang
MLT · 95 · 27 · 0 · 09 Jul 2020

Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli, Ozan Sener, George Deligiannidis, Murat A. Erdogdu
86 · 56 · 0 · 16 Jun 2020

Learning Long-Term Dependencies in Irregularly-Sampled Time Series
Mathias Lechner, Ramin Hasani
AI4TS · 71 · 132 · 0 · 08 Jun 2020

Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li
MLT, AAML · 122 · 151 · 0 · 20 May 2020

Disentangling Adaptive Gradient Methods from Learning Rates
Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
109 · 38 · 0 · 26 Feb 2020

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective
Kaixuan Huang, Yuqing Wang, Molei Tao, T. Zhao
MLT · 62 · 98 · 0 · 14 Feb 2020

Generalization and Representational Limits of Graph Neural Networks
Vikas Garg, Stefanie Jegelka, Tommi Jaakkola
GNN · 108 · 314 · 0 · 14 Feb 2020

Constructing Gradient Controllable Recurrent Neural Networks Using Hamiltonian Dynamics
Konstantin Rusch, J. Pearson, K. Zygalakis
34 · 0 · 0 · 11 Nov 2019

Machine Learning for Prediction with Missing Dynamics
J. Harlim, Shixiao W. Jiang, Senwei Liang, Haizhao Yang
AI4CE · 71 · 61 · 0 · 13 Oct 2019

Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Ziwei Ji, Matus Telgarsky
98 · 178 · 0 · 26 Sep 2019

Convex Programming for Estimation in Nonlinear Recurrent Models
S. Bahmani, Justin Romberg
57 · 10 · 0 · 26 Aug 2019

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li, Colin Wei, Tengyu Ma
90 · 300 · 0 · 10 Jul 2019

Learning in Gated Neural Networks
Ashok Vardhan Makkuva, Sewoong Oh, Sreeram Kannan, Pramod Viswanath
50 · 11 · 0 · 06 Jun 2019

What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu, Yuanzhi Li
416 · 183 · 0 · 24 May 2019

A Selective Overview of Deep Learning
Jianqing Fan, Cong Ma, Yiqiao Zhong
BDL, VLM · 206 · 135 · 0 · 10 Apr 2019

On the Power and Limitations of Random Features for Understanding Neural Networks
Gilad Yehudai, Ohad Shamir
MLT · 125 · 182 · 0 · 01 Apr 2019

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang
MLT · 235 · 775 · 0 · 12 Nov 2018

On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
253 · 193 · 0 · 29 Oct 2018