The Implicit Regularization of Stochastic Gradient Flow for Least Squares
International Conference on Machine Learning (ICML), 2020
17 March 2020
Alnur Ali, Guang Cheng, Robert Tibshirani

Papers citing "The Implicit Regularization of Stochastic Gradient Flow for Least Squares"

Showing 50 of 93 citing papers.
Comparing regularisation paths of (conjugate) gradient estimators in ridge regression
Laura Hucker, Markus Reiß, Thomas Stark
07 Mar 2025
Learning from True-False Labels via Multi-modal Prompt Retrieving
Zhongnian Li, Jinghao Xu, Peng Ying, Meng Wei, Tongfeng Sun
24 May 2024
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Rodrigo Veiga, Anastasia Remizova, Nicolas Macris
12 Feb 2024
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
International Conference on Machine Learning (ICML), 2023
Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham Kakade, Boaz Barak
14 Jun 2023
Generalized equivalences between subsampling and ridge regularization
Neural Information Processing Systems (NeurIPS), 2023
Pratik V. Patil, Jin-Hong Du
29 May 2023
Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability
Haoyi Xiong, Xuhong Li, Bo Yu, Zhanxing Zhu, Dongrui Wu, Dejing Dou
01 Apr 2023
On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks
Journal of Machine Learning Research (JMLR), 2023
Sebastian Neumayer, Lénaïc Chizat, M. Unser
31 Mar 2023
Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression
Bhavya Agrawalla, Krishnakumar Balasubramanian, Promit Ghosal
20 Feb 2023
On Implicit Bias in Overparameterized Bilevel Optimization
International Conference on Machine Learning (ICML), 2022
Paul Vicol, Jon Lorraine, Fabian Pedregosa, David Duvenaud, Roger C. Grosse
28 Dec 2022
Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis
Neural Information Processing Systems (NeurIPS), 2022
Taiki Miyagawa
28 Oct 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Conference on Learning Theory (COLT), 2022
Loucas Pillaud-Vivien, J. Reygner, Nicolas Flammarion
20 Jun 2022
How You Start Matters for Generalization
Sameera Ramasinghe, L. MacDonald, M. Farazi, Hemanth Saratchandran, Simon Lucey
17 Jun 2022
Algorithmic Stability of Heavy-Tailed Stochastic Gradient Descent on Least Squares
International Conference on Algorithmic Learning Theory (ALT), 2022
Anant Raj, Melih Barsbey, Mert Gurbuzbalaban, Lingjiong Zhu, Umut Simsekli
02 Jun 2022
Sobolev Acceleration and Statistical Optimality for Learning Elliptic Equations via Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Yiping Lu, Jose H. Blanchet, Lexing Ying
15 May 2022
The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models
International Symposium on Information Theory (ISIT), 2022
Yiling Luo, X. Huo, Y. Mei
29 Apr 2022
High-dimensional Asymptotics of Langevin Dynamics in Spiked Matrix Models
Information and Inference: A Journal of the IMA, 2022
Tengyuan Liang, Subhabrata Sen, Pragya Sur
09 Apr 2022
On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
Elvis Dohmatob, A. Bietti
22 Mar 2022
A Note on Machine Learning Approach for Computational Imaging
Bin Dong
24 Feb 2022
On Optimal Early Stopping: Over-informative versus Under-informative Parametrization
Ruoqi Shen, Liyao (Mars) Gao, Yi-An Ma
20 Feb 2022
NoisyMix: Boosting Model Robustness to Common Corruptions
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
N. Benjamin Erichson, Soon Hoe Lim, Winnie Xu, Francisco Utrera, Ziang Cao, Michael W. Mahoney
02 Feb 2022
Accelerated Gradient Flow: Risk, Stability, and Implicit Regularization
Yue Sheng, Alnur Ali
20 Jan 2022
The Implicit Regularization of Momentum Gradient Descent with Early Stopping
Li Wang, Yingcong Zhou, Zhiguo Fu
14 Jan 2022
A Continuous-time Stochastic Gradient Descent Method for Continuous Data
Kexin Jin, J. Latz, Chenguang Liu, Carola-Bibiane Schönlieb
07 Dec 2021
Multi-scale Feature Learning Dynamics: Insights for Double Descent
International Conference on Machine Learning (ICML), 2021
Mohammad Pezeshki, Amartya Mitra, Yoshua Bengio, Guillaume Lajoie
06 Dec 2021
Does Momentum Change the Implicit Regularization on Separable Data?
Neural Information Processing Systems (NeurIPS), 2021
Bohan Wang, Qi Meng, Huishuai Zhang, Tian Ding, Wei Chen, Zhirui Ma, Tie-Yan Liu
08 Oct 2021
AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow
Haiyan Jiang, Haoyi Xiong, Dongrui Wu, Ji Liu, Dejing Dou
07 Oct 2021
Robust Generalization of Quadratic Neural Networks via Function Identification
Kan Xu, Hamsa Bastani, Osbert Bastani
22 Sep 2021
Comparing Classes of Estimators: When does Gradient Descent Beat Ridge Regression in Linear Models?
Dominic Richards, Guang Cheng, Patrick Rebeschini
26 Aug 2021
The Benefits of Implicit Regularization from SGD in Least Squares Problems
Neural Information Processing Systems (NeurIPS), 2021
Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean Phillips Foster, Sham Kakade
10 Aug 2021
Interpolation can hurt robust generalization even when there is no noise
Neural Information Processing Systems (NeurIPS), 2021
Konstantin Donhauser, Alexandru Țifrea, Michael Aerni, Reinhard Heckel, Fanny Yang
05 Aug 2021
The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
Neural Computation, 2021
D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins
19 Jul 2021
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
Neural Information Processing Systems (NeurIPS), 2021
Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion
17 Jun 2021
RATT: Leveraging Unlabeled Data to Guarantee Generalization
International Conference on Machine Learning (ICML), 2021
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton
01 May 2021
Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings
International Conference on Machine Learning (ICML), 2021
Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani
18 Apr 2021
Implicit Regularization in Tensor Factorization
International Conference on Machine Learning (ICML), 2021
Noam Razin, Asaf Maman, Nadav Cohen
19 Feb 2021
Noisy Recurrent Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney
09 Feb 2021
Implicit bias of deep linear networks in the large learning rate phase
Wei Huang, Weitao Du, R. Xu, Chunrui Liu
25 Nov 2020
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu
04 Nov 2020
A Continuous-Time Mirror Descent Approach to Sparse Phase Retrieval
Neural Information Processing Systems (NeurIPS), 2020
Fan Wu, Patrick Rebeschini
20 Oct 2020
Implicit Gradient Regularization
International Conference on Learning Representations (ICLR), 2020
David Barrett, Benoit Dherin
23 Sep 2020
On Computability, Learnability and Extractability of Finite State Machines from Recurrent Neural Networks
Reda Marzouk
10 Sep 2020
On the Regularization Effect of Stochastic Gradient Descent applied to Least Squares
Stefan Steinerberger
27 Jul 2020
The Heavy-Tail Phenomenon in SGD
Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu
08 Jun 2020
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Noam Razin, Nadav Cohen
13 May 2020
Stochastic gradient descent with random learning rate
Daniele Musso
15 Mar 2020
The Statistical Complexity of Early-Stopped Mirror Descent
Neural Information Processing Systems (NeurIPS), 2020
Tomas Vaskevicius, Varun Kanade, Patrick Rebeschini
01 Feb 2020
Implicit Regularization for Optimal Sparse Recovery
Neural Information Processing Systems (NeurIPS), 2019
Tomas Vaskevicius, Varun Kanade, Patrick Rebeschini
11 Sep 2019
Stochastic Gradient and Langevin Processes
Xiang Cheng, Dong Yin, Peter L. Bartlett, Sai Li
07 Jul 2019
SGD on Neural Networks Learns Functions of Increasing Complexity
Neural Information Processing Systems (NeurIPS), 2019
Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak
28 May 2019
Beating SGD Saturation with Tail-Averaging and Minibatching
Nicole Mücke, Gergely Neu, Lorenzo Rosasco
22 Feb 2019