A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks

4 October 2018
Sanjeev Arora
Nadav Cohen
Noah Golowich
Wei Hu

Papers citing "A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks"

50 / 53 papers shown
Title
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks
Pierfrancesco Beneventano
Blake Woodworth
MLT
34
1
0
15 Jan 2025
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
30
3
0
22 Sep 2024
How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
Pierfrancesco Beneventano
Andrea Pinto
Tomaso A. Poggio
MLT
27
1
0
17 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
37
12
0
06 Jun 2024
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
Michael Kohler
A. Krzyżak
Benjamin Walter
26
1
0
13 May 2024
Analysis of the expected $L_2$ error of an over-parametrized deep neural network estimate learned by gradient descent without regularization
Selina Drews
Michael Kohler
25
2
0
24 Nov 2023
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min
Enrique Mallada
René Vidal
MLT
32
19
0
24 Jul 2023
Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage
Yu Gui
Cong Ma
Yiqiao Zhong
22
6
0
06 Jun 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Itai Kreisler
Mor Shpigel Nacson
Daniel Soudry
Y. Carmon
23
13
0
22 May 2023
Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Pierre Bréchet
Katerina Papagiannouli
Jing An
Guido Montúfar
23
3
0
06 Mar 2023
Effects of Data Geometry in Early Deep Learning
Saket Tiwari
G. Konidaris
69
7
0
29 Dec 2022
Asymptotic Analysis of Deep Residual Networks
R. Cont
Alain Rossier
Renyuan Xu
19
4
0
15 Dec 2022
Symmetries, flat minima, and the conserved quantities of gradient flow
Bo-Lu Zhao
I. Ganev
Robin G. Walters
Rose Yu
Nima Dehmamy
44
16
0
31 Oct 2022
Deep Linear Networks for Matrix Completion -- An Infinite Depth Limit
Nadav Cohen
Govind Menon
Zsolt Veraszto
ODL
21
7
0
22 Oct 2022
TiDAL: Learning Training Dynamics for Active Learning
Seong Min Kye
Kwanghee Choi
Hyeongmin Byun
Buru Chang
26
13
0
13 Oct 2022
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma
Li-Zhen Guo
S. Fattahi
38
4
0
01 Oct 2022
Implicit Full Waveform Inversion with Deep Neural Representation
Jian-jun Sun
K. Innanen
AI4CE
32
32
0
08 Sep 2022
Intersection of Parallels as an Early Stopping Criterion
Ali Vardasbi
Maarten de Rijke
Mostafa Dehghani
MoMe
33
5
0
19 Aug 2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Andrew M. Saxe
Shagun Sodhani
Sam Lewallen
AI4CE
28
34
0
21 Jul 2022
Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli
21
71
0
08 Jun 2022
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon
C. Pehlevan
MLT
24
79
0
19 May 2022
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries
Haekyu Park
Seongmin Lee
Benjamin Hoover
Austin P. Wright
Omar Shaikh
Rahul Duggal
Nilaksh Das
Kevin Li
Judy Hoffman
Duen Horng Chau
19
2
0
30 Mar 2022
Convergence of gradient descent for deep neural networks
S. Chatterjee
ODL
19
20
0
30 Mar 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk
J. Cyranka
ODL
30
3
0
28 Jan 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla
Jing Wang
A. Choromańska
25
34
0
20 Jan 2022
Over-Parametrized Matrix Factorization in the Presence of Spurious Stationary Points
Armin Eftekhari
19
1
0
25 Dec 2021
Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks
P. Esser
L. C. Vankadara
D. Ghoshdastidar
28
53
0
07 Dec 2021
PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations
Louis Fortier-Dubois
Gaël Letarte
Benjamin Leblanc
François Laviolette
Pascal Germain
UQCV
14
0
0
28 Oct 2021
A global convergence theory for deep ReLU implicit networks via over-parameterization
Tianxiang Gao
Hailiang Liu
Jia Liu
Hridesh Rajan
Hongyang Gao
MLT
23
16
0
11 Oct 2021
Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
Jiayao Zhang
Hua Wang
Weijie J. Su
29
7
0
11 Oct 2021
Towards Demystifying Representation Learning with Non-contrastive Self-supervision
Xiang Wang
Xinlei Chen
S. Du
Yuandong Tian
SSL
16
26
0
11 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
110
9
0
08 Oct 2021
The loss landscape of deep linear neural networks: a second-order analysis
E. M. Achour
François Malgouyres
Sébastien Gerchinovitz
ODL
22
9
0
28 Jul 2021
A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf
Alon Brutzkus
Amir Globerson
26
17
0
04 Jul 2021
Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
Dominik Stöger
Mahdi Soltanolkotabi
ODL
31
74
0
28 Jun 2021
Scaling Properties of Deep Residual Networks
A. Cohen
R. Cont
Alain Rossier
Renyuan Xu
17
18
0
25 May 2021
Deep matrix factorizations
Pierre De Handschutter
Nicolas Gillis
Xavier Siebert
BDL
28
40
0
01 Oct 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai
Shengjie Luo
Keyulu Xu
Di He
Tie-Yan Liu
Liwei Wang
GNN
16
158
0
07 Sep 2020
Deep Polynomial Neural Networks
Grigorios G. Chrysos
Stylianos Moschoglou
Giorgos Bouritsas
Jiankang Deng
Yannis Panagakis
S. Zafeiriou
21
92
0
20 Jun 2020
Directional Pruning of Deep Neural Networks
Shih-Kang Chao
Zhanyu Wang
Yue Xing
Guang Cheng
ODL
8
33
0
16 Jun 2020
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Noam Razin
Nadav Cohen
16
155
0
13 May 2020
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid
Mikayel Samvelyan
Christian Schroeder de Witt
Gregory Farquhar
Jakob N. Foerster
Shimon Whiteson
47
767
0
19 Mar 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski
Maciej Szymczak
Stanislav Fort
Devansh Arpit
Jacek Tabor
Kyunghyun Cho
Krzysztof J. Geras
40
154
0
21 Feb 2020
Global Convergence of Gradient Descent for Deep Linear Residual Networks
Lei Wu
Qingcan Wang
Chao Ma
ODL
AI4CE
20
22
0
02 Nov 2019
Neural Similarity Learning
Weiyang Liu
Zhen Liu
James M. Rehg
Le Song
18
29
0
28 Oct 2019
Overparameterized Neural Networks Implement Associative Memory
Adityanarayanan Radhakrishnan
M. Belkin
Caroline Uhler
BDL
19
71
0
26 Sep 2019
Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora
Nadav Cohen
Wei Hu
Yuping Luo
AI4CE
24
491
0
31 May 2019
Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
Weinan E
Chao Ma
Qingcan Wang
Lei Wu
MLT
24
22
0
10 Apr 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
13
93
0
24 Jan 2019
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
13
446
0
21 Nov 2018