2012.05156
v3 (latest)
Implicit Regularization in ReLU Networks with the Square Loss
Annual Conference Computational Learning Theory (COLT), 2020
9 December 2020
Gal Vardi
Ohad Shamir
Papers citing
"Implicit Regularization in ReLU Networks with the Square Loss"
41 / 41 papers shown
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Willie Neiswanger
Mahdi Soltanolkotabi
332
7
0
29 May 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
468
23
0
20 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
660
1
0
21 Dec 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
247
8
0
28 Oct 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
387
17
0
21 Aug 2024
Generalization bounds for regression and classification on adaptive covering input domains
Wen-Liang Hwang
273
0
0
29 Jul 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
Neural Information Processing Systems (NeurIPS), 2024
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
378
34
0
10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
334
25
0
06 Jun 2024
ReLUs Are Sufficient for Learning Implicit Neural Representations
Joseph Shenouda
Yamin Zhou
Robert D. Nowak
292
7
0
04 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
686
16
0
26 May 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
530
31
0
08 Feb 2024
Implicit biases in multitask and continual learning from a backward error analysis perspective
Benoit Dherin
394
3
0
01 Nov 2023
Implicit regularisation in stochastic gradient descent: from single-objective to two-player games
Mihaela Rosca
M. Deisenroth
212
2
0
11 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
Neural Information Processing Systems (NeurIPS), 2023
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
311
25
0
04 Jul 2023
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
International Conference on Learning Representations (ICLR), 2023
Mor Shpigel Nacson
Rotem Mulayoff
Greg Ongie
T. Michaeli
Daniel Soudry
337
20
0
30 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
Neural Information Processing Systems (NeurIPS), 2023
D. Chistikov
Matthias Englert
R. Lazic
MLT
309
16
0
10 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
364
22
0
01 Jun 2023
Penalising the biases in norm regularisation enforces sparsity
Neural Information Processing Systems (NeurIPS), 2023
Etienne Boursier
Nicolas Flammarion
602
20
0
02 Mar 2023
Transformed Low-Rank Parameterization Can Help Robust Generalization for Tensor Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Andong Wang
Chong Li
Mingyuan Bai
Zhong Jin
Guoxu Zhou
Qianchuan Zhao
OOD
AAML
416
9
0
01 Mar 2023
Guided Deep Kernel Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Idan Achituve
Gal Chechik
Ethan Fetaya
BDL
344
7
0
19 Feb 2023
Mixed Semi-Supervised Generalized-Linear-Regression with Applications to Deep-Learning and Interpolators
Yuval Oren
Saharon Rosset
398
1
0
19 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
497
14
0
03 Feb 2023
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
International Conference on Learning Representations (ICLR), 2023
Avrajit Ghosh
He Lyu
Xitong Zhang
Rongrong Wang
287
28
0
02 Feb 2023
On Implicit Bias in Overparameterized Bilevel Optimization
International Conference on Machine Learning (ICML), 2022
Paul Vicol
Jon Lorraine
Fabian Pedregosa
David Duvenaud
Roger C. Grosse
AI4CE
281
47
0
28 Dec 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
212
5
0
13 Oct 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Neural Networks (NN), 2022
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
410
6
0
27 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Journal of Machine Learning Research (JMLR), 2022
Niladri S. Chatterji
Philip M. Long
276
11
0
19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Communications of the ACM (CACM), 2022
Gal Vardi
FedML
AI4CE
432
115
0
26 Aug 2022
Reconstructing Training Data from Trained Neural Networks
Neural Information Processing Systems (NeurIPS), 2022
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
377
175
0
15 Jun 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Neural Information Processing Systems (NeurIPS), 2022
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
342
81
0
02 Jun 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Neural Information Processing Systems (NeurIPS), 2022
Itay Safran
Gal Vardi
Jason D. Lee
MLT
284
24
0
18 May 2022
Support Vectors and Gradient Dynamics of Single-Neuron ReLU Networks
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
230
0
0
11 Feb 2022
Implicit Regularization Towards Rank Minimization in ReLU Networks
International Conference on Algorithmic Learning Theory (ALT), 2022
Nadav Timor
Gal Vardi
Ohad Shamir
244
67
0
30 Jan 2022
Limitation of Characterizing Implicit Regularization by Data-independent Functions
Leyang Zhang
Z. Xu
Yaoyu Zhang
231
0
0
28 Jan 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
International Conference on Machine Learning (ICML), 2022
Noam Razin
Asaf Maman
Nadav Cohen
480
34
0
27 Jan 2022
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
338
34
0
06 Oct 2021
Continuous vs. Discrete Optimization of Deep Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Omer Elkabetz
Nadav Cohen
343
49
0
14 Jul 2021
Learning a Single Neuron with Bias Using Gradient Descent
Neural Information Processing Systems (NeurIPS), 2021
Gal Vardi
Gilad Yehudai
Ohad Shamir
MLT
351
22
0
02 Jun 2021
Implicit Regularization in Tensor Factorization
International Conference on Machine Learning (ICML), 2021
Noam Razin
Asaf Maman
Nadav Cohen
401
60
0
19 Feb 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
International Conference on Machine Learning (ICML), 2021
Shahar Azulay
E. Moroshko
Mor Shpigel Nacson
Blake E. Woodworth
Nathan Srebro
Amir Globerson
Daniel Soudry
AI4CE
324
84
0
19 Feb 2021
Explicit regularization and implicit bias in deep network classifiers trained with the square loss
T. Poggio
Q. Liao
221
45
0
31 Dec 2020