Width Provably Matters in Optimization for Deep Linear Neural Networks

24 January 2019
S. Du, Wei Hu

Papers citing "Width Provably Matters in Optimization for Deep Linear Neural Networks"

Showing 20 of 70 citing papers.

Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher, David Reeb, Ingo Steinwart
04 Nov 2020

A Unifying View on Implicit Bias in Training Linear Neural Networks
International Conference on Learning Representations (ICLR), 2020
Chulhee Yun, Shankar Krishnan, H. Mobahi
06 Oct 2020

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
International Conference on Machine Learning (ICML), 2020
Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy
04 Oct 2020

Deep matrix factorizations
Computer Science Review (CSR), 2020
Pierre De Handschutter, Nicolas Gillis, Xavier Siebert
01 Oct 2020

Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning
Neural Information Processing Systems (NeurIPS), 2020
Chandrashekar Lakshminarayanan, Amit Singh
11 Jun 2020

Analysis of Knowledge Transfer in Kernel Regime
International Conference on Information and Knowledge Management (CIKM), 2020
Arman Rahbar, Ashkan Panahi, Chiranjib Bhattacharyya, Devdatt Dubhashi, M. Chehreghani
30 Mar 2020

On the Global Convergence of Training Deep Linear ResNets
International Conference on Learning Representations (ICLR), 2020
Difan Zou, Philip M. Long, Quanquan Gu
02 Mar 2020

Revealing the Structure of Deep Neural Networks via Convex Duality
International Conference on Machine Learning (ICML), 2020
Tolga Ergen, Mert Pilanci
22 Feb 2020

Deep Gated Networks: A framework to understand training and generalisation in deep learning
Chandrashekar Lakshminarayanan, Amit Singh
10 Feb 2020

Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks
Minshuo Chen, Wenjing Liao, H. Zha, Tuo Zhao
10 Feb 2020

Quasi-Equivalence of Width and Depth of Neural Networks
Fenglei Fan, Rongjie Lai, Ge Wang
06 Feb 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
International Conference on Learning Representations (ICLR), 2020
Wei Hu, Lechao Xiao, Jeffrey Pennington
16 Jan 2020

Global Convergence of Gradient Descent for Deep Linear Residual Networks
Neural Information Processing Systems (NeurIPS), 2019
Lei Wu, Qingcan Wang, Chao Ma
02 Nov 2019

Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
Analysis and Applications (Anal. Appl.), 2019
Yeonjong Shin
14 Oct 2019

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Zhao Song, Xin Yang
09 Jun 2019

Implicit Regularization in Deep Matrix Factorization
Neural Information Processing Systems (NeurIPS), 2019
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
31 May 2019

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang
26 Apr 2019

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
E. Weinan, Chao Ma, Qingcan Wang, Lei Wu
10 Apr 2019

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling
07 Apr 2019

Elimination of All Bad Local Minima in Deep Learning
Kenji Kawaguchi, L. Kaelbling
02 Jan 2019