Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process

19 April 2019

Papers citing "Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process"

46 / 46 papers shown

Title
Analysis of Overparameterization in Continual Learning under a Linear Model Daniel Goldfarb Paul Hand CLL 39 0 0 11 Feb 2025
Optimization Landscapes Learned: Proxy Networks Boost Convergence in Physics-based Inverse Problems Girnar Goyal Philipp Holl Sweta Agrawal Nils Thuerey AI4CE 48 0 0 27 Jan 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training Zhanpeng Zhou Mingze Wang Yuchen Mao Bingrui Li Junchi Yan AAML 62 0 0 14 Oct 2024
Does SGD really happen in tiny subspaces? Minhak Song Kwangjun Ahn Chulhee Yun 71 5 1 25 May 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization Shuo Xie Zhiyuan Li OffRL 50 13 0 05 Apr 2024
Fine-tuning with Very Large Dropout Jianyu Zhang Léon Bottou 44 1 0 01 Mar 2024
Generalization Bounds for Label Noise Stochastic Gradient Descent Jung Eun Huh Patrick Rebeschini 13 1 0 01 Nov 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization Kaiyue Wen Zhiyuan Li Tengyu Ma FAtt 38 26 0 20 Jul 2023
How to escape sharp minima with random perturbations Kwangjun Ahn Ali Jadbabaie S. Sra ODL 32 6 0 25 May 2023
Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models Alexandru Damian Eshaan Nichani Rong Ge Jason D. Lee MLT 42 33 0 18 May 2023
Fairness Uncertainty Quantification: How certain are you that the model is fair? Abhishek Roy P. Mohapatra 24 5 0 27 Apr 2023
Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning Antonio Sclocchi Mario Geiger M. Wyart 40 6 0 31 Jan 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing Jikai Jin Zhiyuan Li Kaifeng Lyu S. Du Jason D. Lee MLT 54 34 0 27 Jan 2023
Learning useful representations for shifting tasks and distributions Jianyu Zhang Léon Bottou OOD 34 13 0 14 Dec 2022
How Does Sharpness-Aware Minimization Minimize Sharpness? Kaiyue Wen Tengyu Ma Zhiyuan Li AAML 23 47 0 10 Nov 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models Hong Liu Sang Michael Xie Zhiyuan Li Tengyu Ma AI4CE 40 49 0 25 Oct 2022
Noise Injection as a Probe of Deep Learning Dynamics Noam Levi I. Bloch M. Freytsis T. Volansky 40 2 0 24 Oct 2022
SGD with Large Step Sizes Learns Sparse Features Maksym Andriushchenko Aditya Varre Loucas Pillaud-Vivien Nicolas Flammarion 45 56 0 11 Oct 2022
Why neural networks find simple solutions: the many regularizers of geometric complexity Benoit Dherin Michael Munn M. Rosca David Barrett 55 31 0 27 Sep 2022
Deep Double Descent via Smooth Interpolation Matteo Gamba Erik Englesson Marten Bjorkman Hossein Azizpour 63 11 0 21 Sep 2022
Generalisation under gradient descent via deterministic PAC-Bayes Eugenio Clerico Tyler Farghly George Deligiannidis Benjamin Guedj Arnaud Doucet 31 4 0 06 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms Gal Vardi FedML AI4CE 34 72 0 26 Aug 2022
Explicit Use of Fourier Spectrum in Generative Adversarial Networks Soroush Sheikh Gargar GAN OOD 29 0 0 02 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent Zhiyuan Li Tianhao Wang Jason D. Lee Sanjeev Arora 42 27 0 08 Jul 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity Jianyi Yang Shaolei Ren 32 3 0 02 Jul 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation Loucas Pillaud-Vivien J. Reygner Nicolas Flammarion NoLa 33 31 0 20 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction Kaifeng Lyu Zhiyuan Li Sanjeev Arora FAtt 40 70 0 14 Jun 2022
Towards Understanding Sharpness-Aware Minimization Maksym Andriushchenko Nicolas Flammarion AAML 35 133 0 13 Jun 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias Itay Safran Gal Vardi Jason D. Lee MLT 59 23 0 18 May 2022
The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models Yiling Luo X. Huo Y. Mei 21 0 0 29 Apr 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes Chao Ma D. Kunin Lei Wu Lexing Ying 25 27 0 24 Apr 2022
Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization I Zaghloul Amir Roi Livni Nathan Srebro 30 6 0 27 Feb 2022
Robust Probabilistic Time Series Forecasting Taeho Yoon Youngsuk Park Ernest K. Ryu Yuyang Wang AAML AI4TS 20 18 0 24 Feb 2022
Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably Tianyi Liu Yan Li Enlu Zhou Tuo Zhao 38 1 0 07 Feb 2022
Anticorrelated Noise Injection for Improved Generalization Antonio Orvieto Hans Kersting F. Proske Francis R. Bach Aurelien Lucchi 58 44 0 06 Feb 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks Noam Razin Asaf Maman Nadav Cohen 46 29 0 27 Jan 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Aviral Kumar Rishabh Agarwal Tengyu Ma Aaron Courville George Tucker Sergey Levine OffRL 31 65 0 09 Dec 2021
The Geometric Occam's Razor Implicit in Deep Learning Benoit Dherin Michael Munn David Barrett 22 6 0 30 Nov 2021
Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks A. Shevchenko Vyacheslav Kungurtsev Marco Mondelli MLT 41 13 0 03 Nov 2021
Regularization by Misclassification in ReLU Neural Networks Elisabetta Cornacchia Jan Hązła Ido Nachum Amir Yehudayoff NoLa 25 2 0 03 Nov 2021
Ridgeless Interpolation with Shallow ReLU Networks in $1D$ is Nearest Neighbor Curvature Extrapolation and Provably Generalizes on Lipschitz Functions Boris Hanin MLT 38 9 0 27 Sep 2021
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization Tianyi Liu Yan Li S. Wei Enlu Zhou T. Zhao 21 13 0 24 Feb 2021
Effective Regularization Through Loss-Function Metalearning Santiago Gonzalez Risto Miikkulainen 26 5 0 02 Oct 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance Jeff Z. HaoChen Colin Wei J. Lee Tengyu Ma 29 93 0 15 Jun 2020
Convex Geometry and Duality of Over-parameterized Neural Networks Tolga Ergen Mert Pilanci MLT 42 54 0 25 Feb 2020
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks Kaifeng Lyu Jian Li 52 322 0 13 Jun 2019