2012.05156
Implicit Regularization in ReLU Networks with the Square Loss
9 December 2020
Gal Vardi
Ohad Shamir
Papers citing
"Implicit Regularization in ReLU Networks with the Square Loss"
40 / 40 papers shown
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Vatsal Sharan
Mahdi Soltanolkotabi
17
0
0
29 May 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
191
10
0
20 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
216
1
0
21 Dec 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
76
4
0
28 Oct 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
104
3
0
21 Aug 2024
Generalization bounds for regression and classification on adaptive covering input domains
Wen-Liang Hwang
67
0
0
29 Jul 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
129
18
0
10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
71
15
0
06 Jun 2024
ReLUs Are Sufficient for Learning Implicit Neural Representations
Joseph Shenouda
Yamin Zhou
Robert D. Nowak
94
6
0
04 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
249
10
0
26 May 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
107
21
0
08 Feb 2024
Implicit biases in multitask and continual learning from a backward error analysis perspective
Benoit Dherin
104
3
0
01 Nov 2023
Implicit regularisation in stochastic gradient descent: from single-objective to two-player games
Mihaela Rosca
M. Deisenroth
58
2
0
11 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
109
15
0
04 Jul 2023
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
Mor Shpigel Nacson
Rotem Mulayoff
Greg Ongie
T. Michaeli
Daniel Soudry
84
13
0
30 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov
Matthias Englert
R. Lazic
MLT
96
12
0
10 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
89
18
0
01 Jun 2023
Penalising the biases in norm regularisation enforces sparsity
Etienne Boursier
Nicolas Flammarion
127
17
0
02 Mar 2023
Transformed Low-Rank Parameterization Can Help Robust Generalization for Tensor Neural Networks
Andong Wang
Chong Li
Mingyuan Bai
Zhong Jin
Guoxu Zhou
Qianchuan Zhao
OOD
AAML
36
5
0
01 Mar 2023
Guided Deep Kernel Learning
Idan Achituve
Gal Chechik
Ethan Fetaya
BDL
66
7
0
19 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
79
10
0
03 Feb 2023
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
Avrajit Ghosh
He Lyu
Xitong Zhang
Rongrong Wang
86
23
0
02 Feb 2023
On Implicit Bias in Overparameterized Bilevel Optimization
Paul Vicol
Jon Lorraine
Fabian Pedregosa
David Duvenaud
Roger C. Grosse
AI4CE
107
38
0
28 Dec 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
42
4
0
13 Oct 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
133
6
0
27 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Niladri S. Chatterji
Philip M. Long
96
8
0
19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
101
81
0
26 Aug 2022
Reconstructing Training Data from Trained Neural Networks
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
116
141
0
15 Jun 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
77
61
0
02 Jun 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran
Gal Vardi
Jason D. Lee
MLT
109
24
0
18 May 2022
Support Vectors and Gradient Dynamics of Single-Neuron ReLU Networks
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
39
0
0
11 Feb 2022
Implicit Regularization Towards Rank Minimization in ReLU Networks
Nadav Timor
Gal Vardi
Ohad Shamir
93
51
0
30 Jan 2022
Limitation of Characterizing Implicit Regularization by Data-independent Functions
Leyang Zhang
Z. Xu
Yaoyu Zhang
33
0
0
28 Jan 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Noam Razin
Asaf Maman
Nadav Cohen
132
29
0
27 Jan 2022
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
162
30
0
06 Oct 2021
Continuous vs. Discrete Optimization of Deep Neural Networks
Omer Elkabetz
Nadav Cohen
111
44
0
14 Jul 2021
Learning a Single Neuron with Bias Using Gradient Descent
Gal Vardi
Gilad Yehudai
Ohad Shamir
MLT
77
17
0
02 Jun 2021
Implicit Regularization in Tensor Factorization
Noam Razin
Asaf Maman
Nadav Cohen
75
49
0
19 Feb 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Shahar Azulay
E. Moroshko
Mor Shpigel Nacson
Blake E. Woodworth
Nathan Srebro
Amir Globerson
Daniel Soudry
AI4CE
89
74
0
19 Feb 2021
Explicit regularization and implicit bias in deep network classifiers trained with the square loss
T. Poggio
Q. Liao
66
42
0
31 Dec 2020