
Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

4 June 2018
S. Du
Wei Hu
Jason D. Lee
    MLT

Papers citing "Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced"

50 / 124 papers shown
Title
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Neural Networks (NN), 2022
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
343
6
0
27 Sep 2022
A Validation Approach to Over-parameterized Matrix and Image Recovery
Lijun Ding
Zhen Qin
Liwei Jiang
Jinxin Zhou
Zhihui Zhu
385
15
0
21 Sep 2022
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Neural Information Processing Systems (NeurIPS), 2022
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
326
23
0
15 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Communications of the ACM (CACM), 2022
Gal Vardi
FedML AI4CE
356
107
0
26 Aug 2022
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Xingyu Xie
Pan Zhou
Huan Li
Zhouchen Lin
Shuicheng Yan
ODL
417
240
0
13 Aug 2022
Implicit Regularization with Polynomial Growth in Deep Tensor Factorization
International Conference on Machine Learning (ICML), 2022
Kais Hariz
Hachem Kadri
Stéphane Ayache
Maher Moakher
Thierry Artières
140
4
0
18 Jul 2022
Symmetry Teleportation for Accelerated Optimization
Neural Information Processing Systems (NeurIPS), 2022
B. Zhao
Nima Dehmamy
Robin Walters
Rose Yu
ODL
409
28
0
21 May 2022
Flat minima generalize for low-rank matrix recovery
Information and Inference: A Journal of the IMA (JIII), 2022
Lijun Ding
Dmitriy Drusvyatskiy
Maryam Fazel
Zaid Harchaoui
188
29
0
07 Mar 2022
Algorithmic Regularization in Model-free Overparametrized Asymmetric Matrix Factorization
SIAM Journal on Mathematics of Data Science (SIMODS), 2022
Liwei Jiang
Yudong Chen
Lijun Ding
201
32
0
06 Mar 2022
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
International Conference on Learning Representations (ICLR), 2022
Ananya Kumar
Aditi Raghunathan
Robbie Jones
Tengyu Ma
Abigail Z. Jacobs
OODD
304
815
0
21 Feb 2022
On Optimal Early Stopping: Over-informative versus Under-informative Parametrization
Ruoqi Shen
Liyao (Mars) Gao
Yi-An Ma
238
16
0
20 Feb 2022
Support Vectors and Gradient Dynamics of Single-Neuron ReLU Networks
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
179
0
0
11 Feb 2022
Implicit Regularization Towards Rank Minimization in ReLU Networks
International Conference on Algorithmic Learning Theory (ALT), 2022
Nadav Timor
Gal Vardi
Ohad Shamir
213
62
0
30 Jan 2022
Understanding Deep Contrastive Learning via Coordinate-wise Optimization
Neural Information Processing Systems (NeurIPS), 2022
Yuandong Tian
461
41
0
29 Jan 2022
Training invariances and the low-rank phenomenon: beyond linear networks
International Conference on Learning Representations (ICLR), 2022
Thien Le
Stefanie Jegelka
237
36
0
28 Jan 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
International Conference on Machine Learning (ICML), 2022
Noam Razin
Asaf Maman
Nadav Cohen
397
33
0
27 Jan 2022
How and When Random Feedback Works: A Case Study of Low-Rank Matrix Factorization
Shivam Garg
Santosh Vempala
296
3
0
17 Nov 2021
Regularization by Misclassification in ReLU Neural Networks
Elisabetta Cornacchia
Jan Hązła
Ido Nachum
Amir Yehudayoff
NoLa
156
2
0
03 Nov 2021
Neural Networks as Kernel Learners: The Silent Alignment Effect
International Conference on Learning Representations (ICLR), 2021
Alexander B. Atanasov
Blake Bordelon
Cengiz Pehlevan
MLT
362
94
0
29 Oct 2021
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
Kaifeng Lyu
Zhiyuan Li
Runzhe Wang
Sanjeev Arora
MLT
244
81
0
26 Oct 2021
On the Regularization of Autoencoders
Harald Steck
Dario Garcia-Garcia
SSL AI4CE
191
4
0
21 Oct 2021
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang
Minshuo Chen
T. Zhao
Molei Tao
AI4CE
281
49
0
07 Oct 2021
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
236
32
0
06 Oct 2021
Convergence of gradient descent for learning linear neural networks
Advances in Continuous and Discrete Models (ACDM), 2021
Gabin Maxime Nguegnang
Holger Rauhut
Ulrich Terstiege
MLT
263
25
0
04 Aug 2021
Nonconvex Factorization and Manifold Formulations are Almost Equivalent in Low-rank Matrix Optimization
Yuetian Luo
Xudong Li
Anru R. Zhang
266
11
0
03 Aug 2021
Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
Journal of Mathematical Analysis and Applications (JMAA), 2021
Arnulf Jentzen
Adrian Riekert
184
26
0
09 Jul 2021
A Mechanism for Producing Aligned Latent Spaces with Autoencoders
Saachi Jain
Adityanarayanan Radhakrishnan
Caroline Uhler
165
11
0
29 Jun 2021
Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Neural Information Processing Systems (NeurIPS), 2021
Tian-Chun Ye
S. Du
165
56
0
27 Jun 2021
Principal Components Bias in Over-parameterized Linear Models, and its Manifestation in Deep Neural Networks
Journal of Machine Learning Research (JMLR), 2021
Guy Hacohen
D. Weinshall
407
13
0
12 May 2021
Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Hidenori Tanaka
D. Kunin
337
39
0
06 May 2021
Noether: The More Things Change, the More Stay the Same
Grzegorz Gluch
R. Urbanke
155
21
0
12 Apr 2021
Approximating How Single Head Attention Learns
Charles Burton Snell
Ruiqi Zhong
Dan Klein
Jacob Steinhardt
MLT
168
33
0
13 Mar 2021
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Tianyi Liu
Yan Li
S. Wei
Enlu Zhou
T. Zhao
153
18
0
24 Feb 2021
Implicit Regularization in Tensor Factorization
International Conference on Machine Learning (ICML), 2021
Noam Razin
Asaf Maman
Nadav Cohen
239
55
0
19 Feb 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
International Conference on Machine Learning (ICML), 2021
Shahar Azulay
E. Moroshko
Mor Shpigel Nacson
Blake E. Woodworth
Nathan Srebro
Amir Globerson
Daniel Soudry
AI4CE
251
78
0
19 Feb 2021
Understanding self-supervised Learning Dynamics without Contrastive Pairs
International Conference on Machine Learning (ICML), 2021
Yuandong Tian
Xinlei Chen
Surya Ganguli
SSL
559
317
0
12 Feb 2021
Implicit Regularization in ReLU Networks with the Square Loss
Annual Conference Computational Learning Theory (COLT), 2020
Gal Vardi
Ohad Shamir
221
53
0
09 Dec 2020
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin
Javier Sagastuy-Breña
Surya Ganguli
Daniel L. K. Yamins
Hidenori Tanaka
341
89
0
08 Dec 2020
Gradient Descent for Deep Matrix Factorization: Dynamics and Implicit Bias towards Low Rank
Social Science Research Network (SSRN), 2020
H. Chou
Carsten Gieshoff
J. Maly
Holger Rauhut
437
46
0
27 Nov 2020
Neural collapse with unconstrained features
Sampling Theory, Signal Processing, and Data Analysis (TSPDA), 2020
D. Mixon
Hans Parshall
Jianzong Pi
220
137
0
23 Nov 2020
Gradient Starvation: A Learning Proclivity in Neural Networks
Neural Information Processing Systems (NeurIPS), 2020
Mohammad Pezeshki
Sekouba Kaba
Yoshua Bengio
Aaron Courville
Doina Precup
Guillaume Lajoie
MLT
486
305
0
18 Nov 2020
Adaptive Signal Variances: CNN Initialization Through Modern Architectures
Takahiko Henmi
E. R. R. Zara
Yoshihiro Hirohashi
Tsuyoshi Kato
182
3
0
16 Aug 2020
Understanding Implicit Regularization in Over-Parameterized Single Index Model
Journal of the American Statistical Association (JASA), 2020
Jianqing Fan
Zhuoran Yang
Mengxin Yu
285
22
0
16 Jul 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen
Colin Wei
Jason D. Lee
Tengyu Ma
603
108
0
15 Jun 2020
Pruning neural networks without any data by iteratively conserving synaptic flow
Hidenori Tanaka
D. Kunin
Daniel L. K. Yamins
Surya Ganguli
526
749
0
09 Jun 2020
Statistical Guarantees for Regularized Neural Networks
Neural Networks (NN), 2020
Mahsa Taheri
Fang Xie
Johannes Lederer
267
41
0
30 May 2020
Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent
Tian Tong
Cong Ma
Yuejie Chi
504
131
0
18 May 2020
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Noam Razin
Nadav Cohen
282
163
0
13 May 2020
Robust and On-the-fly Dataset Denoising for Image Classification
European Conference on Computer Vision (ECCV), 2020
Jiaming Song
Lunjia Hu
Michael Auli
Yann N. Dauphin
Tengyu Ma
NoLa OOD
158
13
0
24 Mar 2020
On Alignment in Deep Linear Neural Networks
Adityanarayanan Radhakrishnan
Eshaan Nichani
D. Bernstein
Caroline Uhler
129
2
0
13 Mar 2020