arXiv: 2403.08121
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations

12 March 2024
Akshay Kumar
Jarvis Haupt
    ODL

Papers citing "Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations"

36 / 36 papers shown
Title
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
Zheng-an Chen
Tao Luo
AI4CE
128
1
0
08 Oct 2025
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
D. Kunin
Giovanni Luca Marchetti
F. Chen
Dhruva Karkada
James B. Simon
M. DeWeese
Surya Ganguli
Nina Miolane
385
4
0
06 Jun 2025
An overview of condensation phenomenon in deep learning
Zhi-Qin John Xu
Yaoyu Zhang
Zhangchen Zhou
AI4CE
214
11
0
13 Apr 2025
Towards Better Generalization: Weight Decay Induces Low-rank Bias for Neural Networks
Ke Chen
Chugang Yi
Haizhao Yang
MLT
151
2
0
03 Oct 2024
Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar
Jarvis Haupt
ODL
185
10
0
14 Feb 2024
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
International Conference on Learning Representations (ICLR), 2023
Hancheng Min
Enrique Mallada
René Vidal
MLT
242
28
0
24 Jul 2023
Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
Neural Information Processing Systems (NeurIPS), 2023
Mingze Wang
Chao Ma
134
16
0
21 May 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
International Conference on Machine Learning (ICML), 2023
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
258
44
0
27 Jan 2023
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Neural Information Processing Systems (NeurIPS), 2022
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
260
73
0
02 Jun 2022
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Neural Information Processing Systems (NeurIPS), 2022
Hanxu Zhou
Qixuan Zhou
Zhenyuan Jin
Tao Luo
Yaoyu Zhang
Zhi-Qin John Xu
220
22
0
24 May 2022
Neural Networks as Kernel Learners: The Silent Alignment Effect
International Conference on Learning Representations (ICLR), 2021
Alexander B. Atanasov
Blake Bordelon
Cengiz Pehlevan
MLT
362
94
0
29 Oct 2021
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
Kaifeng Lyu
Zhiyuan Li
Runzhe Wang
Sanjeev Arora
MLT
228
81
0
26 Oct 2021
Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
Arthur Jacot
François Ged
Berfin Şimşek
Clément Hongler
Franck Gabriel
312
65
0
30 Jun 2021
Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
Neural Information Processing Systems (NeurIPS), 2021
Dominik Stöger
Mahdi Soltanolkotabi
ODL
349
86
0
28 Jun 2021
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning
International Conference on Learning Representations (ICLR), 2020
Zhiyuan Li
Yuping Luo
Kaifeng Lyu
201
143
0
17 Dec 2020
Phase diagram for two-layer ReLU neural networks at infinite-width limit
Journal of Machine Learning Research (JMLR), 2020
Tao Luo
Zhi-Qin John Xu
Zheng Ma
Yaoyu Zhang
198
71
0
15 Jul 2020
Directional convergence and alignment in deep learning
Neural Information Processing Systems (NeurIPS), 2020
Ziwei Ji
Matus Telgarsky
228
198
0
11 Jun 2020
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Annual Conference Computational Learning Theory (COLT), 2020
Lénaïc Chizat
Francis R. Bach
MLT
582
364
0
11 Feb 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Neural Information Processing Systems (NeurIPS), 2019
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
956
48,276
0
03 Dec 2019
Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning
Mathematical Programming (Math. Program.), 2019
Jérôme Bolte
Edouard Pauwels
433
152
0
23 Sep 2019
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
International Conference on Learning Representations (ICLR), 2019
Kaifeng Lyu
Jian Li
482
363
0
13 Jun 2019
Kernel and Rich Regimes in Overparametrized Models
Annual Conference Computational Learning Theory (COLT), 2019
Blake E. Woodworth
Suriya Gunasekar
Pedro H. P. Savarese
E. Moroshko
Itay Golan
Jason D. Lee
Daniel Soudry
Nathan Srebro
323
390
0
13 Jun 2019
Implicit Regularization in Deep Matrix Factorization
Neural Information Processing Systems (NeurIPS), 2019
Sanjeev Arora
Nadav Cohen
Wei Hu
Yuping Luo
AI4CE
352
557
0
31 May 2019
Approximation spaces of deep neural networks
Constructive Approximation (Constr. Approx.), 2019
Rémi Gribonval
Gitta Kutyniok
M. Nielsen
Felix Voigtländer
198
137
0
03 May 2019
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
597
985
0
26 Apr 2019
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
Song Mei
Theodor Misiakiewicz
Andrea Montanari
MLT
303
300
0
16 Feb 2019
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
514
903
0
19 Dec 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
1.8K
3,638
0
20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat
Francis R. Bach
OT
383
794
0
24 May 2018
Gradient Descent Quantizes ReLU Network Features
Hartmut Maennel
Olivier Bousquet
Sylvain Gelly
MLT
134
88
0
22 Mar 2018
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi
Adel Javanmard
Jason D. Lee
499
435
0
16 Jul 2017
Recovery Guarantees for One-hidden-layer Neural Networks
International Conference on Machine Learning (ICML), 2017
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
345
344
0
10 Jun 2017
Implicit Regularization in Matrix Factorization
Suriya Gunasekar
Blake E. Woodworth
Srinadh Bhojanapalli
Behnam Neyshabur
Nathan Srebro
258
527
0
25 May 2017
Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $\ell^1$ and $\ell^0$ Controls
Jason M. Klusowski
Andrew R. Barron
500
164
0
26 Jul 2016
On the Computational Efficiency of Training Neural Networks
Neural Information Processing Systems (NeurIPS), 2014
Roi Livni
Shai Shalev-Shwartz
Ohad Shamir
364
490
0
05 Oct 2014
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
International Conference on Learning Representations (ICLR), 2013
Andrew M. Saxe
James L. McClelland
Surya Ganguli
ODL
1.0K
1,964
0
20 Dec 2013