On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization

Sanjeev Arora, Nadav Cohen, Elad Hazan
arXiv:1802.06509 · 19 February 2018

Papers citing "On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization"

Showing 29 of 129 citing papers:
• Overparameterized Neural Networks Implement Associative Memory
  Adityanarayanan Radhakrishnan, M. Belkin, Caroline Uhler (26 Sep 2019) [BDL]
• Meta-Learning with Warped Gradient Descent
  Sebastian Flennerhag, Andrei A. Rusu, Razvan Pascanu, Francesco Visin, Hujun Yin, R. Hadsell (30 Aug 2019)
• Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
  Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee (24 Jul 2019) [ODL]
• Associated Learning: Decomposing End-to-end Backpropagation based on Auto-encoders and Target Propagation
  Yu-Wei Kao, Hung-Hsuan Chen (13 Jun 2019) [BDL]
• Implicit Regularization in Deep Matrix Factorization
  Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo (31 May 2019) [AI4CE]
• On the Expressive Power of Deep Polynomial Neural Networks
  Joe Kileel, Matthew Trager, Joan Bruna (29 May 2019)
• Optimisation of Overparametrized Sum-Product Networks
  Martin Trapp, Robert Peharz, Franz Pernkopf (20 May 2019) [TPM]
• Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
  Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling (07 Apr 2019) [AAML]
• Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
  Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak (27 Mar 2019) [NoLa]
• Neural Empirical Bayes
  Saeed Saremi, Aapo Hyvarinen (06 Mar 2019)
• An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
  Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba (21 Feb 2019) [ODL]
• Understanding over-parameterized deep networks by geometrization
  Xiao Dong, Ling Zhou (11 Feb 2019) [GNN, AI4CE]
• Stiffness: A New Perspective on Generalization in Neural Networks
  Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan (28 Jan 2019)
• Width Provably Matters in Optimization for Deep Linear Neural Networks
  S. Du, Wei Hu (24 Jan 2019)
• Gradient Descent Happens in a Tiny Subspace
  Guy Gur-Ari, Daniel A. Roberts, Ethan Dyer (12 Dec 2018)
• Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
  Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu (21 Nov 2018) [ODL]
• Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
  Colin Wei, J. Lee, Qiang Liu, Tengyu Ma (12 Oct 2018)
• A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
  Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu (04 Oct 2018)
• Gradient Descent Provably Optimizes Over-parameterized Neural Networks
  S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh (04 Oct 2018) [MLT, ODL]
• Gradient descent aligns the layers of deep linear networks
  Ziwei Ji, Matus Telgarsky (04 Oct 2018)
• Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks
  Ohad Shamir (23 Sep 2018)
• On the Learning Dynamics of Deep Neural Networks
  Rémi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio (18 Sep 2018)
• Filter Distillation for Network Compression
  Xavier Suau, Luca Zappella, N. Apostoloff (20 Jul 2018)
• ResNet with one-neuron hidden layers is a Universal Approximator
  Hongzhou Lin, Stefanie Jegelka (28 Jun 2018)
• Deep Neural Networks with Multi-Branch Architectures Are Less Non-Convex
  Hongyang R. Zhang, Junru Shao, Ruslan Salakhutdinov (06 Jun 2018)
• Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
  S. Du, Wei Hu, J. Lee (04 Jun 2018) [MLT]
• High-dimensional dynamics of generalization error in neural networks
  Madhu S. Advani, Andrew M. Saxe (10 Oct 2017) [AI4CE]
• A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
  Weijie Su, Stephen P. Boyd, Emmanuel J. Candes (04 Mar 2015)
• The Loss Surfaces of Multilayer Networks
  A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun (30 Nov 2014) [ODL]