Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1904.09080
Cited By
Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process
19 April 2019
Guy Blanc
Neha Gupta
Gregory Valiant
Paul Valiant
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process"
46 / 46 papers shown
Title
Analysis of Overparameterization in Continual Learning under a Linear Model
Daniel Goldfarb
Paul Hand
CLL
39
0
0
11 Feb 2025
Optimization Landscapes Learned: Proxy Networks Boost Convergence in Physics-based Inverse Problems
Girnar Goyal
Philipp Holl
Sweta Agrawal
Nils Thuerey
AI4CE
48
0
0
27 Jan 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Zhanpeng Zhou
Mingze Wang
Yuchen Mao
Bingrui Li
Junchi Yan
AAML
62
0
0
14 Oct 2024
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
71
5
1
25 May 2024
Implicit Bias of AdamW:
ℓ
∞
\ell_\infty
ℓ
∞
Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
50
13
0
05 Apr 2024
Fine-tuning with Very Large Dropout
Jianyu Zhang
Léon Bottou
44
1
0
01 Mar 2024
Generalization Bounds for Label Noise Stochastic Gradient Descent
Jung Eun Huh
Patrick Rebeschini
13
1
0
01 Nov 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
Kaiyue Wen
Zhiyuan Li
Tengyu Ma
FAtt
38
26
0
20 Jul 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
32
6
0
25 May 2023
Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
Alexandru Damian
Eshaan Nichani
Rong Ge
Jason D. Lee
MLT
42
33
0
18 May 2023
Fairness Uncertainty Quantification: How certain are you that the model is fair?
Abhishek Roy
P. Mohapatra
24
5
0
27 Apr 2023
Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning
Antonio Sclocchi
Mario Geiger
M. Wyart
40
6
0
31 Jan 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
54
34
0
27 Jan 2023
Learning useful representations for shifting tasks and distributions
Jianyu Zhang
Léon Bottou
OOD
34
13
0
14 Dec 2022
How Does Sharpness-Aware Minimization Minimize Sharpness?
Kaiyue Wen
Tengyu Ma
Zhiyuan Li
AAML
23
47
0
10 Nov 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
40
49
0
25 Oct 2022
Noise Injection as a Probe of Deep Learning Dynamics
Noam Levi
I. Bloch
M. Freytsis
T. Volansky
40
2
0
24 Oct 2022
SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko
Aditya Varre
Loucas Pillaud-Vivien
Nicolas Flammarion
45
56
0
11 Oct 2022
Why neural networks find simple solutions: the many regularizers of geometric complexity
Benoit Dherin
Michael Munn
M. Rosca
David Barrett
55
31
0
27 Sep 2022
Deep Double Descent via Smooth Interpolation
Matteo Gamba
Erik Englesson
Marten Bjorkman
Hossein Azizpour
63
11
0
21 Sep 2022
Generalisation under gradient descent via deterministic PAC-Bayes
Eugenio Clerico
Tyler Farghly
George Deligiannidis
Benjamin Guedj
Arnaud Doucet
31
4
0
06 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
34
72
0
26 Aug 2022
Explicit Use of Fourier Spectrum in Generative Adversarial Networks
Soroush Sheikh Gargar
GAN
OOD
29
0
0
02 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
42
27
0
08 Jul 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang
Shaolei Ren
32
3
0
02 Jul 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien
J. Reygner
Nicolas Flammarion
NoLa
33
31
0
20 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
40
70
0
14 Jun 2022
Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko
Nicolas Flammarion
AAML
35
133
0
13 Jun 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran
Gal Vardi
Jason D. Lee
MLT
59
23
0
18 May 2022
The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models
Yiling Luo
X. Huo
Y. Mei
21
0
0
29 Apr 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma
D. Kunin
Lei Wu
Lexing Ying
25
27
0
24 Apr 2022
Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization
I Zaghloul Amir
Roi Livni
Nathan Srebro
30
6
0
27 Feb 2022
Robust Probabilistic Time Series Forecasting
Taeho Yoon
Youngsuk Park
Ernest K. Ryu
Yuyang Wang
AAML
AI4TS
20
18
0
24 Feb 2022
Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
Tianyi Liu
Yan Li
Enlu Zhou
Tuo Zhao
38
1
0
07 Feb 2022
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto
Hans Kersting
F. Proske
Francis R. Bach
Aurelien Lucchi
58
44
0
06 Feb 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Noam Razin
Asaf Maman
Nadav Cohen
46
29
0
27 Jan 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar
Rishabh Agarwal
Tengyu Ma
Aaron Courville
George Tucker
Sergey Levine
OffRL
31
65
0
09 Dec 2021
The Geometric Occam's Razor Implicit in Deep Learning
Benoit Dherin
Micheal Munn
David Barrett
22
6
0
30 Nov 2021
Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
A. Shevchenko
Vyacheslav Kungurtsev
Marco Mondelli
MLT
41
13
0
03 Nov 2021
Regularization by Misclassification in ReLU Neural Networks
Elisabetta Cornacchia
Jan Hązła
Ido Nachum
Amir Yehudayoff
NoLa
25
2
0
03 Nov 2021
Ridgeless Interpolation with Shallow ReLU Networks in
1
D
1D
1
D
is Nearest Neighbor Curvature Extrapolation and Provably Generalizes on Lipschitz Functions
Boris Hanin
MLT
38
9
0
27 Sep 2021
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu
Yan Li
S. Wei
Enlu Zhou
T. Zhao
21
13
0
24 Feb 2021
Effective Regularization Through Loss-Function Metalearning
Santiago Gonzalez
Risto Miikkulainen
26
5
0
02 Oct 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen
Colin Wei
J. Lee
Tengyu Ma
29
93
0
15 Jun 2020
Convex Geometry and Duality of Over-parameterized Neural Networks
Tolga Ergen
Mert Pilanci
MLT
42
54
0
25 Feb 2020
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
52
322
0
13 Jun 2019
1