Implicit Regularization in ReLU Networks with the Square Loss

Annual Conference Computational Learning Theory (COLT), 2020

9 December 2020

Gal Vardi

Ohad Shamir

Papers citing "Implicit Regularization in ReLU Networks with the Square Loss"

41 / 41 papers shown

The Rich and the Simple: On the Implicit Bias of Adam and SGD

338

29 May 2025

Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations

Yize Zhao

Tina Behnia

V. Vakilian

Christos Thrampoulidis

469

20 Feb 2025

Optimization Insights into Deep Diagonal Linear Networks

660

21 Dec 2024

NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks

Yongchang Hao

Yanshuai Cao

Lili Mou

249

28 Oct 2024

Approaching Deep Learning through the Spectral Dynamics of Weights

389

21 Aug 2024

Generalization bounds for regression and classification on adaptive covering input domains

Wen-Liang Hwang

273

29 Jul 2024

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning

Neural Information Processing Systems (NeurIPS), 2024

Feng Chen

384

10 Jun 2024

Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

Can Yaras

Peng Wang

Laura Balzano

Qing Qu

AI4CE

341

06 Jun 2024

ReLUs Are Sufficient for Learning Implicit Neural Representations

Joseph Shenouda

Yamin Zhou

Robert D. Nowak

295

04 Jun 2024

When does compositional structure yield compositional generalization? A kernel theory

Samuel Lippl

Kim Stachenfeld

NAI CoGe

692

26 May 2024

Implicit Bias and Fast Convergence Rates for Self-attention

Bhavya Vasudeva

Puneesh Deora

Christos Thrampoulidis

531

08 Feb 2024

Implicit biases in multitask and continual learning from a backward error analysis perspective

Benoit Dherin

411

01 Nov 2023

Implicit regularisation in stochastic gradient descent: from single-objective to two-player games

Mihaela Rosca

M. Deisenroth

213

11 Jul 2023

Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses

Neural Information Processing Systems (NeurIPS), 2023

Gal Vardi

316

04 Jul 2023

The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks

International Conference on Learning Representations (ICLR), 2023

345

30 Jun 2023

Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs

Neural Information Processing Systems (NeurIPS), 2023

322

10 Jun 2023

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

Qing Qu

370

01 Jun 2023

Penalising the biases in norm regularisation enforces sparsity

Neural Information Processing Systems (NeurIPS), 2023

Etienne Boursier

Nicolas Flammarion

614

02 Mar 2023

Transformed Low-Rank Parameterization Can Help Robust Generalization for Tensor Neural Networks

Neural Information Processing Systems (NeurIPS), 2023

423

01 Mar 2023

Guided Deep Kernel Learning

Conference on Uncertainty in Artificial Intelligence (UAI), 2023

344

19 Feb 2023

Mixed Semi-Supervised Generalized-Linear-Regression with Applications to Deep-Learning and Interpolators

Yuval Oren

Saharon Rosset

405

19 Feb 2023

On a continuous time model of gradient descent dynamics and instability in deep learning

505

03 Feb 2023

Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent

International Conference on Learning Representations (ICLR), 2023

296

02 Feb 2023

On Implicit Bias in Overparameterized Bilevel Optimization

International Conference on Machine Learning (ICML), 2022

Paul Vicol

281

28 Dec 2022

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

Neural Information Processing Systems (NeurIPS), 2022

218

13 Oct 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons

Neural Networks (NN), 2022

411

27 Sep 2022

Deep Linear Networks can Benignly Overfit when Shallow Ones Do

Journal of Machine Learning Research (JMLR), 2022

Niladri S. Chatterji

Philip M. Long

278

19 Sep 2022

On the Implicit Bias in Deep-Learning Algorithms

Communications of the ACM (CACM), 2022

Gal Vardi

FedML AI4CE

445

115

26 Aug 2022

Reconstructing Training Data from Trained Neural Networks

Neural Information Processing Systems (NeurIPS), 2022

Gal Vardi

381

175

15 Jun 2022

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs

Neural Information Processing Systems (NeurIPS), 2022

Etienne Boursier

Loucas Pillaud-Vivien

Nicolas Flammarion

ODL MLT

342

02 Jun 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias

Neural Information Processing Systems (NeurIPS), 2022

Itay Safran

Gal Vardi

Jason D. Lee

MLT

286

18 May 2022

Support Vectors and Gradient Dynamics of Single-Neuron ReLU Networks

234

11 Feb 2022

Implicit Regularization Towards Rank Minimization in ReLU Networks

International Conference on Algorithmic Learning Theory (ALT), 2022

Nadav Timor

Gal Vardi

Ohad Shamir

245

30 Jan 2022

Limitation of Characterizing Implicit Regularization by Data-independent Functions

Leyang Zhang

Z. Xu

Yaoyu Zhang

240

28 Jan 2022

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

International Conference on Machine Learning (ICML), 2022

Noam Razin

Asaf Maman

Nadav Cohen

483

27 Jan 2022

On Margin Maximization in Linear and ReLU Networks

Gal Vardi

Ohad Shamir

Nathan Srebro

339

06 Oct 2021

Continuous vs. Discrete Optimization of Deep Neural Networks

Neural Information Processing Systems (NeurIPS), 2021

Omer Elkabetz

Nadav Cohen

350

14 Jul 2021

Learning a Single Neuron with Bias Using Gradient Descent

Neural Information Processing Systems (NeurIPS), 2021

Gal Vardi

Gilad Yehudai

Ohad Shamir

MLT

352

02 Jun 2021

Implicit Regularization in Tensor Factorization

International Conference on Machine Learning (ICML), 2021

Noam Razin

Asaf Maman

Nadav Cohen

402

19 Feb 2021

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent

International Conference on Machine Learning (ICML), 2021

331

19 Feb 2021

Explicit regularization and implicit bias in deep network classifiers trained with the square loss

T. Poggio

Q. Liao

223

31 Dec 2020