2108.02040
Cited By
Convergence of gradient descent for learning linear neural networks
4 August 2021
Gabin Maxime Nguegnang
Holger Rauhut
Ulrich Terstiege
MLT
Papers citing "Convergence of gradient descent for learning linear neural networks"
14 / 14 papers shown
Title
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks
Pierfrancesco Beneventano
Blake Woodworth
MLT
34
1
0
15 Jan 2025
Convergence of continuous-time stochastic gradient descent with applications to linear deep neural networks
Gabor Lugosi
Eulalia Nualart
13
0
0
11 Sep 2024
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
Nadav Cohen
Noam Razin
31
0
0
25 Aug 2024
How do Transformers perform In-Context Autoregressive Learning?
Michael E. Sander
Raja Giryes
Taiji Suzuki
Mathieu Blondel
Gabriel Peyré
32
7
0
08 Feb 2024
Geometry of Linear Neural Networks: Equivariance and Invariance under Permutation Groups
Kathlén Kohn
Anna-Laura Sattelberger
V. Shahverdi
25
3
0
24 Sep 2023
Asymmetric matrix sensing by gradient descent with small random initialization
J. S. Wind
38
1
0
04 Sep 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
28
7
0
09 May 2023
Function Space and Critical Points of Linear Convolutional Networks
Kathlén Kohn
Guido Montúfar
V. Shahverdi
Matthew Trager
16
11
0
12 Apr 2023
Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Pierre Bréchet
Katerina Papagiannouli
Jing An
Guido Montúfar
23
3
0
06 Mar 2023
Side Effects of Learning from Low-dimensional Data Embedded in a Euclidean Space
Juncai He
R. Tsai
Rachel A. Ward
36
8
0
01 Mar 2022
Continuous vs. Discrete Optimization of Deep Neural Networks
Omer Elkabetz
Nadav Cohen
62
44
0
14 Jul 2021
Global optimality conditions for deep neural networks
Chulhee Yun
S. Sra
Ali Jadbabaie
121
117
0
08 Jul 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,888
0
15 Sep 2016
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi
J. Nutini
Mark W. Schmidt
133
1,198
0
16 Aug 2016