Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data

3 August 2018
Yuanzhi Li
Yingyu Liang
    MLT

Papers citing "Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data"

28 / 128 papers shown
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
Sanjeev Arora
S. Du
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
Dingli Yu
AAML
11
161
0
03 Oct 2019
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Yu Bai
J. Lee
11
116
0
03 Oct 2019
How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?
Kartikeya Bhardwaj
Guihong Li
R. Marculescu
30
1
0
02 Oct 2019
The generalization error of random features regression: Precise asymptotics and double descent curve
Song Mei
Andrea Montanari
39
626
0
14 Aug 2019
Kernel and Rich Regimes in Overparametrized Models
Blake E. Woodworth
Suriya Gunasekar
Pedro H. P. Savarese
E. Moroshko
Itay Golan
J. Lee
Daniel Soudry
Nathan Srebro
19
353
0
13 Jun 2019
Parameterized Structured Pruning for Deep Neural Networks
Günther Schindler
Wolfgang Roth
Franz Pernkopf
Holger Froening
16
6
0
12 Jun 2019
Enhancing Adversarial Defense by k-Winners-Take-All
Chang Xiao
Peilin Zhong
Changxi Zheng
AAML
11
97
0
25 May 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu
Yuanzhi Li
24
183
0
24 May 2019
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
Atsushi Nitanda
Geoffrey Chinot
Taiji Suzuki
MLT
13
33
0
23 May 2019
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
24
900
0
26 Apr 2019
Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
Weinan E
Chao Ma
Qingcan Wang
Lei Wu
MLT
27
22
0
10 Apr 2019
Many Task Learning with Task Routing
Gjorgji Strezoski
N. V. Noord
M. Worring
MoE
16
96
0
28 Mar 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li
Mahdi Soltanolkotabi
Samet Oymak
NoLa
28
350
0
27 Mar 2019
A Priori Estimates of the Population Risk for Residual Networks
Weinan E
Chao Ma
Qingcan Wang
UQCV
29
61
0
06 Mar 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
35
961
0
24 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
16
93
0
24 Jan 2019
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
22
446
0
21 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
J. Lee
Haochuan Li
Liwei Wang
M. Tomizuka
ODL
21
1,120
0
09 Nov 2018
On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao-quan Song
16
191
0
29 Oct 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
Chulhee Yun
S. Sra
Ali Jadbabaie
13
117
0
17 Oct 2018
Learning Two-layer Neural Networks with Symmetric Inputs
Rong Ge
Rohith Kuditipudi
Zhize Li
Xiang Wang
OOD
MLT
28
57
0
16 Oct 2018
A Priori Estimates of the Population Risk for Two-layer Neural Networks
Weinan E
Chao Ma
Lei Wu
27
130
0
15 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
J. Lee
Qiang Liu
Tengyu Ma
18
243
0
12 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczos
Aarti Singh
MLT
ODL
33
1,251
0
04 Oct 2018
Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization
G. Wang
G. Giannakis
Jie Chen
MLT
22
131
0
14 Aug 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang
Yaodong Yu
Lingxiao Wang
Quanquan Gu
MLT
26
134
0
20 Jun 2018
When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?
Tengyu Xu
Yi Zhou
Kaiyi Ji
Yingbin Liang
21
19
0
12 Jun 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,889
0
15 Sep 2016