2005.06398
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
13 May 2020
Noam Razin
Nadav Cohen
Papers citing
"Implicit Regularization in Deep Learning May Not Be Explainable by Norms"
50 / 112 papers shown
Title
Why is Your Language Model a Poor Implicit Reward Model?
Noam Razin
Yong Lin
Jiarui Yao
Sanjeev Arora
LRM
83
0
0
10 Jul 2025
Flatness After All?
N. Shoham
Liron Mor Yosef
H. Avron
59
0
0
21 Jun 2025
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
Yixiao Huang
Hanlin Zhu
Tianyu Guo
Jiantao Jiao
Somayeh Sojoudi
Michael I. Jordan
Stuart Russell
Song Mei
LRM
334
4
0
12 Jun 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
225
0
0
11 Apr 2025
Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture
Yikun Hou
Suvrit Sra
A. Yurtsever
163
0
0
27 Jan 2025
Weight decay induces low-rank attention layers
Neural Information Processing Systems (NeurIPS), 2024
Seijin Kobayashi
Yassir Akram
J. Oswald
171
20
0
31 Oct 2024
Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization
Yang Chen
Long Yang
Yitao Liang
Zhouchen Lin
194
2
0
11 Oct 2024
Tailed Low-Rank Matrix Factorization for Similarity Matrix Completion
Changyi Ma
Runsheng Yu
Xiao Chen
Youzhi Zhang
135
0
0
29 Sep 2024
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
Nadav Cohen
Noam Razin
178
2
0
25 Aug 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
185
12
0
21 Aug 2024
The Implicit Bias of Adam on Separable Data
Neural Information Processing Systems (NeurIPS), 2024
Chenyang Zhang
Difan Zou
Yuan Cao
AI4CE
178
14
0
15 Jun 2024
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion
Zhiwei Bai
Jiajie Zhao
Yaoyu Zhang
AI4CE
165
2
0
22 May 2024
On Uncertainty Quantification for Near-Bayes Optimal Algorithms
Ziyu Wang
Chris Holmes
UQCV
187
3
0
28 Mar 2024
Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems
Junwei Su
Difan Zou
Chuan Wu
248
0
0
13 Mar 2024
Implicit Regularization via Spectral Neural Networks and Non-linear Matrix Sensing
Hong T.M. Chu
Subhro Ghosh
Chi Thanh Lam
Soumendu Sundar Mukherjee
92
1
0
27 Feb 2024
On the Role of Initialization on the Implicit Bias in Deep Linear Networks
Oria Gruber
H. Avron
AI4CE
90
1
0
04 Feb 2024
Linear Recursive Feature Machines provably recover low-rank matrices
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2024
Adityanarayanan Radhakrishnan
Misha Belkin
Dmitriy Drusvyatskiy
184
12
0
09 Jan 2024
The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models
Tolga Ergen
Mert Pilanci
100
4
0
19 Dec 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
International Conference on Learning Representations (ICLR), 2023
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
145
50
0
30 Nov 2023
In Search of a Data Transformation That Accelerates Neural Field Training
Computer Vision and Pattern Recognition (CVPR), 2023
Junwon Seo
Sangyoon Lee
Kwang In Kim
Jaeho Lee
211
6
0
28 Nov 2023
Vanishing Gradients in Reinforcement Finetuning of Language Models
International Conference on Learning Representations (ICLR), 2023
Noam Razin
Hattie Zhou
Omid Saremi
Vimal Thilak
Arwen Bradley
Preetum Nakkiran
Josh Susskind
Etai Littwin
194
16
0
31 Oct 2023
A Quadratic Synchronization Rule for Distributed Deep Learning
International Conference on Learning Representations (ICLR), 2023
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
179
3
0
22 Oct 2023
Training Dynamics of Deep Network Linear Regions
Ahmed Imtiaz Humayun
Randall Balestriero
Richard Baraniuk
145
4
0
19 Oct 2023
Are GATs Out of Balance?
Neural Information Processing Systems (NeurIPS), 2023
Nimrah Mustafa
Aleksandar Bojchevski
R. Burkholz
214
8
0
11 Oct 2023
Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks
J. S. Wind
Vegard Antun
A. Hansen
139
5
0
13 Jul 2023
Implicit regularisation in stochastic gradient descent: from single-objective to two-player games
Mihaela Rosca
M. Deisenroth
96
2
0
11 Jul 2023
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
International Conference on Learning Representations (ICLR), 2023
Mor Shpigel Nacson
Rotem Mulayoff
Greg Ongie
T. Michaeli
Daniel Soudry
138
16
0
30 Jun 2023
Maintaining Plasticity in Deep Continual Learning
Shibhansh Dohare
J. F. Hernandez-Garcia
Parash Rahman
A. Rupam Mahmood
Richard S. Sutton
KELM
CLL
209
34
0
23 Jun 2023
The Inductive Bias of Flatness Regularization for Deep Matrix Factorization
Khashayar Gatmiry
Zhiyuan Li
Ching-Yao Chuang
Sashank J. Reddi
Tengyu Ma
Stefanie Jegelka
ODL
128
13
0
22 Jun 2023
Exact Count of Boundary Pieces of ReLU Classifiers: Towards the Proper Complexity Measure for Classification
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Paweł Piwek
Adam Klukowski
Tianyang Hu
100
5
0
15 Jun 2023
Transformers learn through gradual rank increase
Neural Information Processing Systems (NeurIPS), 2023
Enric Boix-Adserà
Etai Littwin
Emmanuel Abbe
Samy Bengio
J. Susskind
188
42
0
12 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
Neural Information Processing Systems (NeurIPS), 2023
D. Chistikov
Matthias Englert
R. Lazic
MLT
164
14
0
10 Jun 2023
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
International Conference on Machine Learning (ICML), 2023
Yihao Xue
S. Joshi
Eric Gan
Pin-Yu Chen
Baharan Mirzasoleiman
SSL
205
32
0
25 May 2023
Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank
International Conference on Learning Representations (ICLR), 2023
Zihan Wang
Arthur Jacot
156
23
0
25 May 2023
ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models
SIAM Journal on Mathematics of Data Science (SIMODS), 2023
Suzanna Parkinson
Greg Ongie
Rebecca Willett
282
7
0
24 May 2023
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
International Conference on Machine Learning (ICML), 2023
Guohao Shen
198
6
0
19 May 2023
Robust Implicit Regularization via Weight Normalization
Information and Inference: A Journal of the IMA (JIII), 2023
H. Chou
Holger Rauhut
Rachel A. Ward
197
8
0
09 May 2023
Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems
Kevin Zeng
Carlos E. Pérez De Jesús
Andrew J Fox
M. Graham
AI4CE
202
22
0
01 May 2023
On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks
Journal of Machine Learning Research (JMLR), 2023
Sebastian Neumayer
Lénaïc Chizat
M. Unser
147
2
0
31 Mar 2023
What Makes Data Suitable for a Locally Connected Neural Network? A Necessary and Sufficient Condition Based on Quantum Entanglement
Neural Information Processing Systems (NeurIPS), 2023
Yotam Alexander
Nimrod De La Vega
Noam Razin
Nadav Cohen
213
6
0
20 Mar 2023
First-order ANIL learns linear representations despite misspecified latent dimension
Oğuz Kaan Yüksel
Etienne Boursier
Nicolas Flammarion
162
1
0
02 Mar 2023
Penalising the biases in norm regularisation enforces sparsity
Neural Information Processing Systems (NeurIPS), 2023
Etienne Boursier
Nicolas Flammarion
264
18
0
02 Mar 2023
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
International Conference on Learning Representations (ICLR), 2023
Avrajit Ghosh
He Lyu
Xitong Zhang
Rongrong Wang
138
27
0
02 Feb 2023
Simplicity Bias in 1-Hidden Layer Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Depen Morwani
Jatin Batra
Prateek Jain
Praneeth Netrapalli
162
26
0
01 Feb 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
International Conference on Machine Learning (ICML), 2023
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
246
61
0
30 Jan 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
International Conference on Machine Learning (ICML), 2023
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
186
44
0
27 Jan 2023
A Dynamics Theory of Implicit Regularization in Deep Low-Rank Matrix Factorization
JIAN-PENG Cao
Chao Qian
Yihui Huang
Dicheng Chen
Yuncheng Gao
Jiyang Dong
D. Guo
X. Qu
215
1
0
29 Dec 2022
Rank-1 Matrix Completion with Gradient Descent and Small Random Initialization
Neural Information Processing Systems (NeurIPS), 2022
Daesung Kim
Hye Won Chung
156
3
0
19 Dec 2022
On the Ability of Graph Neural Networks to Model Interactions Between Vertices
Neural Information Processing Systems (NeurIPS), 2022
Noam Razin
Tom Verbin
Nadav Cohen
251
16
0
29 Nov 2022
Infinite-width limit of deep linear neural networks
Communications on Pure and Applied Mathematics (CPAM), 2022
Lénaïc Chizat
Maria Colombo
Xavier Fernández-Real
Alessio Figalli
134
21
0
29 Nov 2022