Width Provably Matters in Optimization for Deep Linear Neural Networks

24 January 2019
S. Du, Wei Hu

Papers citing "Width Provably Matters in Optimization for Deep Linear Neural Networks"

Showing 20 of 70 citing papers.

Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher, David Reeb, Ingo Steinwart
04 Nov 2020

A Unifying View on Implicit Bias in Training Linear Neural Networks
International Conference on Learning Representations (ICLR), 2020
Chulhee Yun, Shankar Krishnan, H. Mobahi
06 Oct 2020

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
International Conference on Machine Learning (ICML), 2020
Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy
04 Oct 2020

Deep matrix factorizations
Computer Science Review (CSR), 2020
Pierre De Handschutter, Nicolas Gillis, Xavier Siebert
01 Oct 2020

Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning
Neural Information Processing Systems (NeurIPS), 2020
Chandrashekar Lakshminarayanan, Amit Singh
11 Jun 2020

Analysis of Knowledge Transfer in Kernel Regime
International Conference on Information and Knowledge Management (CIKM), 2020
Arman Rahbar, Ashkan Panahi, Chiranjib Bhattacharyya, Devdatt Dubhashi, M. Chehreghani
30 Mar 2020

On the Global Convergence of Training Deep Linear ResNets
International Conference on Learning Representations (ICLR), 2020
Difan Zou, Philip M. Long, Quanquan Gu
02 Mar 2020

Revealing the Structure of Deep Neural Networks via Convex Duality
International Conference on Machine Learning (ICML), 2020
Tolga Ergen, Mert Pilanci
22 Feb 2020

Deep Gated Networks: A framework to understand training and generalisation in deep learning
Chandrashekar Lakshminarayanan, Amit Singh
10 Feb 2020

Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks
Minshuo Chen, Wenjing Liao, H. Zha, Tuo Zhao
10 Feb 2020

Quasi-Equivalence of Width and Depth of Neural Networks
Fenglei Fan, Rongjie Lai, Ge Wang
06 Feb 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
International Conference on Learning Representations (ICLR), 2020
Wei Hu, Lechao Xiao, Jeffrey Pennington
16 Jan 2020

Global Convergence of Gradient Descent for Deep Linear Residual Networks
Neural Information Processing Systems (NeurIPS), 2019
Lei Wu, Qingcan Wang, Chao Ma
02 Nov 2019

Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
Analysis and Applications (Anal. Appl.), 2019
Yeonjong Shin
14 Oct 2019

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Zhao Song, Xin Yang
09 Jun 2019

Implicit Regularization in Deep Matrix Factorization
Neural Information Processing Systems (NeurIPS), 2019
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
31 May 2019

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang
26 Apr 2019

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
E. Weinan, Chao Ma, Qingcan Wang, Lei Wu
10 Apr 2019

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling
07 Apr 2019

Elimination of All Bad Local Minima in Deep Learning
Kenji Kawaguchi, L. Kaelbling
02 Jan 2019