Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

21 November 2018

Quanquan Gu

Papers citing "Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks"

41 / 91 papers shown

Title
Predicting Training Time Without Training L. Zancato Alessandro Achille Avinash Ravichandran Rahul Bhotika Stefano Soatto 18 24 0 28 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy Zuyue Fu Zhuoran Yang Zhaoran Wang 15 42 0 02 Aug 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy E. Moroshko Suriya Gunasekar Blake E. Woodworth J. Lee Nathan Srebro Daniel Soudry 27 85 0 13 Jul 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture Greg Yang 40 134 0 25 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory Yufeng Zhang Qi Cai Zhuoran Yang Yongxin Chen Zhaoran Wang OOD MLT 72 11 0 08 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning Zeyuan Allen-Zhu Yuanzhi Li MLT AAML 27 146 0 20 May 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth Yiping Lu Chao Ma Yulong Lu Jianfeng Lu Lexing Ying MLT 33 78 0 11 Mar 2020
The large learning rate phase of deep learning: the catapult mechanism Aitor Lewkowycz Yasaman Bahri Ethan Dyer Jascha Narain Sohl-Dickstein Guy Gur-Ari ODL 159 234 0 04 Mar 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning Zixin Wen SSL 21 2 0 17 Feb 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality Yi Zhang Orestis Plevrakis S. Du Xingguo Li Zhao Song Sanjeev Arora 21 51 0 16 Feb 2020
Memory capacity of neural networks with threshold and ReLU activations Roman Vershynin 26 21 0 20 Jan 2020
Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity Shiyu Liang Ruoyu Sun R. Srikant 25 19 0 31 Dec 2019
Towards Understanding the Spectral Bias of Deep Learning Yuan Cao Zhiying Fang Yue Wu Ding-Xuan Zhou Quanquan Gu 29 214 0 03 Dec 2019
Neural Contextual Bandits with UCB-based Exploration Dongruo Zhou Lihong Li Quanquan Gu 22 15 0 11 Nov 2019
Enhanced Convolutional Neural Tangent Kernels Zhiyuan Li Ruosong Wang Dingli Yu S. Du Wei Hu Ruslan Salakhutdinov Sanjeev Arora 16 131 0 03 Nov 2019
Global Convergence of Gradient Descent for Deep Linear Residual Networks Lei Wu Qingcan Wang Chao Ma ODL AI4CE 20 22 0 02 Nov 2019
Growing axons: greedy learning of neural networks with application to function approximation Daria Fokina Ivan V. Oseledets 11 18 0 28 Oct 2019
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks Sanjeev Arora S. Du Zhiyuan Li Ruslan Salakhutdinov Ruosong Wang Dingli Yu AAML 9 161 0 03 Oct 2019
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks Yu Bai J. Lee 11 116 0 03 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction Pan Xu F. Gao Quanquan Gu 23 83 0 18 Sep 2019
The generalization error of random features regression: Precise asymptotics and double descent curve Song Mei Andrea Montanari 39 626 0 14 Aug 2019
Kernel and Rich Regimes in Overparametrized Models Blake E. Woodworth Suriya Gunasekar Pedro H. P. Savarese E. Moroshko Itay Golan J. Lee Daniel Soudry Nathan Srebro 19 353 0 13 Jun 2019
Generalization bounds for deep convolutional neural networks Philip M. Long Hanie Sedghi MLT 37 89 0 29 May 2019
Norm-based generalisation bounds for multi-class convolutional neural networks Antoine Ledent Waleed Mustafa Yunwen Lei Marius Kloft 12 5 0 29 May 2019
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems Tianle Cai Ruiqi Gao Jikai Hou Siyu Chen Dong Wang Di He Zhihua Zhang Liwei Wang ODL 16 57 0 28 May 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels? Zeyuan Allen-Zhu Yuanzhi Li 24 183 0 24 May 2019
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems Atsushi Nitanda Geoffrey Chinot Taiji Suzuki MLT 13 33 0 23 May 2019
A type of generalization error induced by initialization in deep neural networks Yaoyu Zhang Zhi-Qin John Xu Tao Luo Zheng Ma 9 49 0 19 May 2019
Linearized two-layers neural networks in high dimension Behrooz Ghorbani Song Mei Theodor Misiakiewicz Andrea Montanari MLT 13 241 0 27 Apr 2019
On Exact Computation with an Infinitely Wide Neural Net Sanjeev Arora S. Du Wei Hu Zhiyuan Li Ruslan Salakhutdinov Ruosong Wang 24 899 0 26 Apr 2019
Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections Weinan E Chao Ma Qingcan Wang Lei Wu MLT 27 22 0 10 Apr 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks Mingchen Li Mahdi Soltanolkotabi Samet Oymak NoLa 26 350 0 27 Mar 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks Sanjeev Arora S. Du Wei Hu Zhiyuan Li Ruosong Wang MLT 35 961 0 24 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks S. Du Wei Hu 16 93 0 24 Jan 2019
Gradient Descent Finds Global Minima of Deep Neural Networks S. Du J. Lee Haochuan Li Liwei Wang Xiyu Zhai ODL 18 1,120 0 09 Nov 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity Chulhee Yun S. Sra Ali Jadbabaie 13 117 0 17 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel Colin Wei J. Lee Qiang Liu Tengyu Ma 18 243 0 12 Oct 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent Xiao Zhang Yaodong Yu Lingxiao Wang Quanquan Gu MLT 26 134 0 20 Jun 2018
Benefits of depth in neural networks Matus Telgarsky 136 602 0 14 Feb 2016
Norm-Based Capacity Control in Neural Networks Behnam Neyshabur Ryota Tomioka Nathan Srebro 119 577 0 27 Feb 2015
The Loss Surfaces of Multilayer Networks A. Choromańska Mikael Henaff Michaël Mathieu Gerard Ben Arous Yann LeCun ODL 179 1,185 0 30 Nov 2014