Implicit Regularization in ReLU Networks with the Square Loss

Annual Conference Computational Learning Theory (COLT), 2020
9 December 2020
Gal Vardi
Ohad Shamir

Papers citing "Implicit Regularization in ReLU Networks with the Square Loss"

41 / 41 papers shown
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Willie Neiswanger
Mahdi Soltanolkotabi
332
7
0
29 May 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
468
23
0
20 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
660
1
0
21 Dec 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
247
8
0
28 Oct 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
387
17
0
21 Aug 2024
Generalization bounds for regression and classification on adaptive covering input domains
Wen-Liang Hwang
273
0
0
29 Jul 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
Neural Information Processing Systems (NeurIPS), 2024
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
378
34
0
10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
334
25
0
06 Jun 2024
ReLUs Are Sufficient for Learning Implicit Neural Representations
Joseph Shenouda
Yamin Zhou
Robert D. Nowak
292
7
0
04 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
686
16
0
26 May 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
530
31
0
08 Feb 2024
Implicit biases in multitask and continual learning from a backward error analysis perspective
Benoit Dherin
394
3
0
01 Nov 2023
Implicit regularisation in stochastic gradient descent: from single-objective to two-player games
Mihaela Rosca
M. Deisenroth
212
2
0
11 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
Neural Information Processing Systems (NeurIPS), 2023
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
311
25
0
04 Jul 2023
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
International Conference on Learning Representations (ICLR), 2023
Mor Shpigel Nacson
Rotem Mulayoff
Greg Ongie
T. Michaeli
Daniel Soudry
337
20
0
30 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
Neural Information Processing Systems (NeurIPS), 2023
D. Chistikov
Matthias Englert
R. Lazic
MLT
309
16
0
10 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
364
22
0
01 Jun 2023
Penalising the biases in norm regularisation enforces sparsity
Neural Information Processing Systems (NeurIPS), 2023
Etienne Boursier
Nicolas Flammarion
602
20
0
02 Mar 2023
Transformed Low-Rank Parameterization Can Help Robust Generalization for Tensor Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Andong Wang
Chong Li
Mingyuan Bai
Zhong Jin
Guoxu Zhou
Qianchuan Zhao
OOD
AAML
416
9
0
01 Mar 2023
Guided Deep Kernel Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Idan Achituve
Gal Chechik
Ethan Fetaya
BDL
344
7
0
19 Feb 2023
Mixed Semi-Supervised Generalized-Linear-Regression with Applications to Deep-Learning and Interpolators
Yuval Oren
Saharon Rosset
398
1
0
19 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
497
14
0
03 Feb 2023
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
International Conference on Learning Representations (ICLR), 2023
Avrajit Ghosh
He Lyu
Xitong Zhang
Rongrong Wang
287
28
0
02 Feb 2023
On Implicit Bias in Overparameterized Bilevel Optimization
International Conference on Machine Learning (ICML), 2022
Paul Vicol
Jon Lorraine
Fabian Pedregosa
David Duvenaud
Roger C. Grosse
AI4CE
281
47
0
28 Dec 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
212
5
0
13 Oct 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Neural Networks (NN), 2022
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
410
6
0
27 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Journal of Machine Learning Research (JMLR), 2022
Niladri S. Chatterji
Philip M. Long
276
11
0
19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Communications of the ACM (CACM), 2022
Gal Vardi
FedML
AI4CE
432
115
0
26 Aug 2022
Reconstructing Training Data from Trained Neural Networks
Neural Information Processing Systems (NeurIPS), 2022
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
377
175
0
15 Jun 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Neural Information Processing Systems (NeurIPS), 2022
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
342
81
0
02 Jun 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Neural Information Processing Systems (NeurIPS), 2022
Itay Safran
Gal Vardi
Jason D. Lee
MLT
284
24
0
18 May 2022
Support Vectors and Gradient Dynamics of Single-Neuron ReLU Networks
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
230
0
0
11 Feb 2022
Implicit Regularization Towards Rank Minimization in ReLU Networks
International Conference on Algorithmic Learning Theory (ALT), 2022
Nadav Timor
Gal Vardi
Ohad Shamir
244
67
0
30 Jan 2022
Limitation of Characterizing Implicit Regularization by Data-independent Functions
Leyang Zhang
Z. Xu
Yaoyu Zhang
231
0
0
28 Jan 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
International Conference on Machine Learning (ICML), 2022
Noam Razin
Asaf Maman
Nadav Cohen
480
34
0
27 Jan 2022
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
338
34
0
06 Oct 2021
Continuous vs. Discrete Optimization of Deep Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Omer Elkabetz
Nadav Cohen
343
49
0
14 Jul 2021
Learning a Single Neuron with Bias Using Gradient Descent
Neural Information Processing Systems (NeurIPS), 2021
Gal Vardi
Gilad Yehudai
Ohad Shamir
MLT
351
22
0
02 Jun 2021
Implicit Regularization in Tensor Factorization
International Conference on Machine Learning (ICML), 2021
Noam Razin
Asaf Maman
Nadav Cohen
401
60
0
19 Feb 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
International Conference on Machine Learning (ICML), 2021
Shahar Azulay
E. Moroshko
Mor Shpigel Nacson
Blake E. Woodworth
Nathan Srebro
Amir Globerson
Daniel Soudry
AI4CE
324
84
0
19 Feb 2021
Explicit regularization and implicit bias in deep network classifiers trained with the square loss
T. Poggio
Q. Liao
221
45
0
31 Dec 2020