A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu
arXiv:1810.02281, 4 October 2018
Papers citing "A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks" (50 of 209 papers shown)
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks. Bartlomiej Polaczyk, J. Cyranka. [ODL]. 28 Jan 2022.
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape. International Conference on Artificial Intelligence and Statistics (AISTATS), 2022. Devansh Bisla, Jing Wang, A. Choromańska. 20 Jan 2022.
Global Convergence Analysis of Deep Linear Networks with A One-neuron Layer. Kun Chen, Dachao Lin, Zhihua Zhang. 08 Jan 2022.
Over-Parametrized Matrix Factorization in the Presence of Spurious Stationary Points. IEEE Transactions on Signal Processing (IEEE TSP), 2021. Armin Eftekhari. 25 Dec 2021.
Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks. Pascal Esser, L. C. Vankadara, Debarghya Ghoshdastidar. 07 Dec 2021.
Error Bounds for a Matrix-Vector Product Approximation with Deep ReLU Neural Networks. T. Getu. 25 Nov 2021.
SGD Through the Lens of Kolmogorov Complexity. Gregory Schwartzman. 10 Nov 2021.
PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations. Louis Fortier-Dubois, Gaël Letarte, Benjamin Leblanc, François Laviolette, Pascal Germain. [UQCV]. 28 Oct 2021.
Convergence Analysis and Implicit Regularization of Feedback Alignment for Deep Linear Networks. M. Girotti, Alexia Jolicoeur-Martineau, Gauthier Gidel. 20 Oct 2021.
A global convergence theory for deep ReLU implicit networks via over-parameterization. International Conference on Learning Representations (ICLR), 2021. Tianxiang Gao, Hailiang Liu, Jia Liu, Hridesh Rajan, Hongyang Gao. [MLT]. 11 Oct 2021.
Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations. Neural Information Processing Systems (NeurIPS), 2021. Jiayao Zhang, Hua Wang, Weijie J. Su. 11 Oct 2021.
Towards Demystifying Representation Learning with Non-contrastive Self-supervision. Xiang Wang, Xinlei Chen, S. Du, Yuandong Tian. [SSL]. 11 Oct 2021.
Speeding up Deep Model Training by Sharing Weights and Then Unsharing. Shuo Yang, Le Hou, Xiaodan Song, Qiang Liu, Denny Zhou. 08 Oct 2021.
Convergence of gradient descent for learning linear neural networks. Advances in Continuous and Discrete Models (ACDM), 2021. Gabin Maxime Nguegnang, Holger Rauhut, Ulrich Terstiege. [MLT]. 04 Aug 2021.
Geometry of Linear Convolutional Networks. Kathlén Kohn, Thomas Merkh, Guido Montúfar, Matthew Trager. 03 Aug 2021.
The loss landscape of deep linear neural networks: a second-order analysis. El Mehdi Achour, François Malgouyres, Sébastien Gerchinovitz. [ODL]. 28 Jul 2021.
Convergence rates for shallow neural networks learned by gradient descent. Alina Braun, Michael Kohler, S. Langer, Harro Walk. 20 Jul 2021.
Continuous vs. Discrete Optimization of Deep Neural Networks. Neural Information Processing Systems (NeurIPS), 2021. Omer Elkabetz, Nadav Cohen. 14 Jul 2021.
A Theoretical Analysis of Fine-tuning with Linear Teachers. Gal Shachaf, Alon Brutzkus, Amir Globerson. 04 Jul 2021.
Analytic Insights into Structure and Rank of Neural Network Hessian Maps. Neural Information Processing Systems (NeurIPS), 2021. Sidak Pal Singh, Gregor Bachmann, Thomas Hofmann. [FAtt]. 30 Jun 2021.
Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity. Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel. 30 Jun 2021.
Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction. Neural Information Processing Systems (NeurIPS), 2021. Dominik Stöger, Mahdi Soltanolkotabi. [ODL]. 28 Jun 2021.
Batch Normalization Orthogonalizes Representations in Deep Random Networks. Neural Information Processing Systems (NeurIPS), 2021. Hadi Daneshmand, Amir Joudaki, Francis R. Bach. [OOD]. 07 Jun 2021.
Towards Understanding Knowledge Distillation. International Conference on Machine Learning (ICML), 2019. Mary Phuong, Christoph H. Lampert. 27 May 2021.
Scaling Properties of Deep Residual Networks. International Conference on Machine Learning (ICML), 2021. A. Cohen, R. Cont, Alain Rossier, Renyuan Xu. 25 May 2021.
Convergence and Implicit Bias of Gradient Flow on Overparametrized Linear Networks. Hancheng Min, Salma Tarmoun, René Vidal, Enrique Mallada. [MLT]. 13 May 2021.
Principal Components Bias in Over-parameterized Linear Models, and its Manifestation in Deep Neural Networks. Journal of Machine Learning Research (JMLR), 2021. Guy Hacohen, D. Weinshall. 12 May 2021.
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth. International Conference on Machine Learning (ICML), 2021. Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi. [GNN]. 10 May 2021.
Noether: The More Things Change, the More Stay the Same. Grzegorz Gluch, R. Urbanke. 12 Apr 2021.
Neurons learn slower than they think. I. Kulikovskikh. 02 Apr 2021.
Student-Teacher Learning from Clean Inputs to Noisy Inputs. Computer Vision and Pattern Recognition (CVPR), 2021. Guanzhe Hong, Zhiyuan Mao, Xiaojun Lin, Stanley H. Chan. 13 Mar 2021.
A Mathematical Principle of Deep Learning: Learn the Geodesic Curve in the Wasserstein Space. Kuo Gai, Shihua Zhang. 18 Feb 2021.
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers. International Conference on Learning Representations (ICLR), 2021. Kenji Kawaguchi. [PINN]. 15 Feb 2021.
Painless step size adaptation for SGD. I. Kulikovskikh, Tarzan Legović. 01 Feb 2021.
Activation Functions in Artificial Neural Networks: A Systematic Overview. Johannes Lederer. [FAtt, AI4CE]. 25 Jan 2021.
Non-Convex Compressed Sensing with Training Data. G. Welper. 20 Jan 2021.
Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples. Applied and Computational Harmonic Analysis (ACHA), 2021. Christian Fiedler, M. Fornasier, T. Klock, Michael Rauchensteiner. [OOD]. 18 Jan 2021.
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks. Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin. 12 Jan 2021.
Recent Theoretical Advances in Non-Convex Optimization. Marina Danilova, Pavel Dvurechensky, Alexander Gasnikov, Eduard A. Gorbunov, Sergey Guminov, Dmitry Kamzolov, Innokentiy Shibaev. 11 Dec 2020.
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics. D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka. 08 Dec 2020.
Asymptotic convergence rate of Dropout on shallow linear neural networks. Measurement and Modeling of Computer Systems (SIGMETRICS), 2020. Albert Senen-Cerda, J. Sanders. 01 Dec 2020.
Deep orthogonal linear networks are shallow. Pierre Ablin. [ODL]. 27 Nov 2020.
Neural Network Training Techniques Regularize Optimization Trajectory: An Empirical Study. Cheng Chen, Junjie Yang, Yi Zhou. 13 Nov 2020.
Generalized Negative Correlation Learning for Deep Ensembling. Sebastian Buschjäger, Lukas Pfahler, K. Morik. [FedML, BDL, UQCV]. 05 Nov 2020.
A Unifying View on Implicit Bias in Training Linear Neural Networks. International Conference on Learning Representations (ICLR), 2020. Chulhee Yun, Shankar Krishnan, H. Mobahi. [MLT]. 06 Oct 2020.
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network. International Conference on Machine Learning (ICML), 2020. Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy. 04 Oct 2020.
A biologically plausible neural network for multi-channel Canonical Correlation Analysis. Neural Computation (Neural Comput.), 2020. David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chklovskii. 01 Oct 2020.
Deep matrix factorizations. Computer Science Review (CSR), 2020. Pierre De Handschutter, Nicolas Gillis, Xavier Siebert. [BDL]. 01 Oct 2020.
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't. CSIAM Transactions on Applied Mathematics (CSIAM Trans. Appl. Math.), 2020. E. Weinan, Chao Ma, Stephan Wojtowytsch, Lei Wu. [AI4CE]. 22 Sep 2020.
A priori guarantees of finite-time convergence for Deep Neural Networks. Anushree Rankawat, M. Rankawat, Harshal B. Oza. 16 Sep 2020.