1802.06509
Cited By
On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
19 February 2018
Sanjeev Arora
Nadav Cohen
Elad Hazan
Papers citing
"On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization"
29 / 129 papers shown
Title
Overparameterized Neural Networks Implement Associative Memory
Adityanarayanan Radhakrishnan
M. Belkin
Caroline Uhler
BDL
35
71
0
26 Sep 2019
Meta-Learning with Warped Gradient Descent
Sebastian Flennerhag
Andrei A. Rusu
Razvan Pascanu
Francesco Visin
Hujun Yin
R. Hadsell
8
209
0
30 Aug 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
42
51
0
24 Jul 2019
Associated Learning: Decomposing End-to-end Backpropagation based on Auto-encoders and Target Propagation
Yu-Wei Kao
Hung-Hsuan Chen
BDL
20
5
0
13 Jun 2019
Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora
Nadav Cohen
Wei Hu
Yuping Luo
AI4CE
38
493
0
31 May 2019
On the Expressive Power of Deep Polynomial Neural Networks
Joe Kileel
Matthew Trager
Joan Bruna
27
82
0
29 May 2019
Optimisation of Overparametrized Sum-Product Networks
Martin Trapp
Robert Peharz
Franz Pernkopf
TPM
11
4
0
20 May 2019
Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi
Jiaoyang Huang
L. Kaelbling
AAML
24
18
0
07 Apr 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li
Mahdi Soltanolkotabi
Samet Oymak
NoLa
47
351
0
27 Mar 2019
Neural Empirical Bayes
Saeed Saremi
Aapo Hyvarinen
12
65
0
06 Mar 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
20
22
0
21 Feb 2019
Understanding over-parameterized deep networks by geometrization
Xiao Dong
Ling Zhou
GNN
AI4CE
21
7
0
11 Feb 2019
Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort
Pawel Krzysztof Nowak
Stanislaw Jastrzebski
S. Narayanan
24
94
0
28 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
23
94
0
24 Jan 2019
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
30
229
0
12 Dec 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
33
446
0
21 Nov 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
J. Lee
Qiang Liu
Tengyu Ma
26
245
0
12 Oct 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora
Nadav Cohen
Noah Golowich
Wei Hu
27
281
0
04 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczos
Aarti Singh
MLT
ODL
56
1,252
0
04 Oct 2018
Gradient descent aligns the layers of deep linear networks
Ziwei Ji
Matus Telgarsky
30
248
0
04 Oct 2018
Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks
Ohad Shamir
35
45
0
23 Sep 2018
On the Learning Dynamics of Deep Neural Networks
Rémi Tachet des Combes
Mohammad Pezeshki
Samira Shabanian
Aaron Courville
Yoshua Bengio
16
38
0
18 Sep 2018
Filter Distillation for Network Compression
Xavier Suau
Luca Zappella
N. Apostoloff
24
38
0
20 Jul 2018
ResNet with one-neuron hidden layers is a Universal Approximator
Hongzhou Lin
Stefanie Jegelka
43
227
0
28 Jun 2018
Deep Neural Networks with Multi-Branch Architectures Are Less Non-Convex
Hongyang R. Zhang
Junru Shao
Ruslan Salakhutdinov
39
14
0
06 Jun 2018
Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
S. Du
Wei Hu
J. Lee
MLT
40
237
0
04 Jun 2018
High-dimensional dynamics of generalization error in neural networks
Madhu S. Advani
Andrew M. Saxe
AI4CE
90
464
0
10 Oct 2017
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su
Stephen P. Boyd
Emmanuel J. Candes
108
1,157
0
04 Mar 2015
The Loss Surfaces of Multilayer Networks
A. Choromańska
Mikael Henaff
Michaël Mathieu
Gerard Ben Arous
Yann LeCun
ODL
183
1,186
0
30 Nov 2014