A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu
arXiv:1810.02281, 4 October 2018
Papers citing "A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks" (50 of 53 papers shown)
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks. Pierfrancesco Beneventano, Blake Woodworth. 15 Jan 2025. [MLT]
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks. Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe. 22 Sep 2024.
How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD. Pierfrancesco Beneventano, Andrea Pinto, Tomaso A. Poggio. 17 Jun 2024. [MLT]
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation. Can Yaras, Peng Wang, Laura Balzano, Qing Qu. 06 Jun 2024. [AI4CE]
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent. Michael Kohler, A. Krzyżak, Benjamin Walter. 13 May 2024.
Analysis of the expected L_2 error of an over-parametrized deep neural network estimate learned by gradient descent without regularization. Selina Drews, Michael Kohler. 24 Nov 2023.
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization. Hancheng Min, Enrique Mallada, René Vidal. 24 Jul 2023. [MLT]
Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage. Yu Gui, Cong Ma, Yiqiao Zhong. 06 Jun 2023.
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond. Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Y. Carmon. 22 May 2023.
Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss. Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar. 06 Mar 2023.
Effects of Data Geometry in Early Deep Learning. Saket Tiwari, G. Konidaris. 29 Dec 2022.
Asymptotic Analysis of Deep Residual Networks. R. Cont, Alain Rossier, Renyuan Xu. 15 Dec 2022.
Symmetries, flat minima, and the conserved quantities of gradient flow. Bo-Lu Zhao, I. Ganev, Robin G. Walters, Rose Yu, Nima Dehmamy. 31 Oct 2022.
Deep Linear Networks for Matrix Completion -- An Infinite Depth Limit. Nadav Cohen, Govind Menon, Zsolt Veraszto. 22 Oct 2022. [ODL]
TiDAL: Learning Training Dynamics for Active Learning. Seong Min Kye, Kwanghee Choi, Hyeongmin Byun, Buru Chang. 13 Oct 2022.
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition. Jianhao Ma, Li-Zhen Guo, S. Fattahi. 01 Oct 2022.
Implicit Full Waveform Inversion with Deep Neural Representation. Jian-jun Sun, K. Innanen. 08 Sep 2022. [AI4CE]
Intersection of Parallels as an Early Stopping Criterion. Ali Vardasbi, Maarten de Rijke, Mostafa Dehghani. 19 Aug 2022. [MoMe]
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks. Andrew M. Saxe, Shagun Sodhani, Sam Lewallen. 21 Jul 2022. [AI4CE]
Neural Collapse: A Review on Modelling Principles and Generalization. Vignesh Kothapalli. 08 Jun 2022.
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks. Blake Bordelon, C. Pehlevan. 19 May 2022. [MLT]
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries. Haekyu Park, Seongmin Lee, Benjamin Hoover, Austin P. Wright, Omar Shaikh, Rahul Duggal, Nilaksh Das, Kevin Li, Judy Hoffman, Duen Horng Chau. 30 Mar 2022.
Convergence of gradient descent for deep neural networks. S. Chatterjee. 30 Mar 2022. [ODL]
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks. Bartlomiej Polaczyk, J. Cyranka. 28 Jan 2022. [ODL]
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape. Devansh Bisla, Jing Wang, A. Choromańska. 20 Jan 2022.
Over-Parametrized Matrix Factorization in the Presence of Spurious Stationary Points. Armin Eftekhari. 25 Dec 2021.
Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks. P. Esser, L. C. Vankadara, D. Ghoshdastidar. 07 Dec 2021.
PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations. Louis Fortier-Dubois, Gaël Letarte, Benjamin Leblanc, François Laviolette, Pascal Germain. 28 Oct 2021. [UQCV]
A global convergence theory for deep ReLU implicit networks via over-parameterization. Tianxiang Gao, Hailiang Liu, Jia Liu, Hridesh Rajan, Hongyang Gao. 11 Oct 2021. [MLT]
Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations. Jiayao Zhang, Hua Wang, Weijie J. Su. 11 Oct 2021.
Towards Demystifying Representation Learning with Non-contrastive Self-supervision. Xiang Wang, Xinlei Chen, S. Du, Yuandong Tian. 11 Oct 2021. [SSL]
Speeding up Deep Model Training by Sharing Weights and Then Unsharing. Shuo Yang, Le Hou, Xiaodan Song, Qiang Liu, Denny Zhou. 08 Oct 2021.
The loss landscape of deep linear neural networks: a second-order analysis. E. M. Achour, François Malgouyres, Sébastien Gerchinovitz. 28 Jul 2021. [ODL]
A Theoretical Analysis of Fine-tuning with Linear Teachers. Gal Shachaf, Alon Brutzkus, Amir Globerson. 04 Jul 2021.
Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction. Dominik Stöger, Mahdi Soltanolkotabi. 28 Jun 2021. [ODL]
Scaling Properties of Deep Residual Networks. A. Cohen, R. Cont, Alain Rossier, Renyuan Xu. 25 May 2021.
Deep matrix factorizations. Pierre De Handschutter, Nicolas Gillis, Xavier Siebert. 01 Oct 2020. [BDL]
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training. Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang. 07 Sep 2020. [GNN]
Deep Polynomial Neural Networks. Grigorios G. Chrysos, Stylianos Moschoglou, Giorgos Bouritsas, Jiankang Deng, Yannis Panagakis, S. Zafeiriou. 20 Jun 2020.
Directional Pruning of Deep Neural Networks. Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng. 16 Jun 2020. [ODL]
Implicit Regularization in Deep Learning May Not Be Explainable by Norms. Noam Razin, Nadav Cohen. 13 May 2020.
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson. 19 Mar 2020.
The Break-Even Point on Optimization Trajectories of Deep Neural Networks. Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras. 21 Feb 2020.
Global Convergence of Gradient Descent for Deep Linear Residual Networks. Lei Wu, Qingcan Wang, Chao Ma. 02 Nov 2019. [ODL, AI4CE]
Neural Similarity Learning. Weiyang Liu, Zhen Liu, James M. Rehg, Le Song. 28 Oct 2019.
Overparameterized Neural Networks Implement Associative Memory. Adityanarayanan Radhakrishnan, M. Belkin, Caroline Uhler. 26 Sep 2019. [BDL]
Implicit Regularization in Deep Matrix Factorization. Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo. 31 May 2019. [AI4CE]
Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections. E. Weinan, Chao Ma, Qingcan Wang, Lei Wu. 10 Apr 2019. [MLT]
Width Provably Matters in Optimization for Deep Linear Neural Networks. S. Du, Wei Hu. 24 Jan 2019.
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks. Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu. 21 Nov 2018. [ODL]