Implicit Bias of Gradient Descent on Linear Convolutional Networks

1 June 2018 · arXiv:1806.00468
Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nathan Srebro
MDE

Papers citing "Implicit Bias of Gradient Descent on Linear Convolutional Networks"

30 / 30 papers shown
Title | Authors | Topics | Metrics | Date
Training Large Neural Networks With Low-Dimensional Error Feedback | Maher Hanut, Jonathan Kadmon | | 78 · 1 · 0 | 27 Feb 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations | Yize Zhao, Tina Behnia, V. Vakilian, Christos Thrampoulidis | | 124 · 10 · 0 | 20 Feb 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks | Sholom Schechtman, Nicolas Schreuder | | 383 · 0 · 0 | 08 Feb 2025
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries | Chris Kolb, T. Weber, Bernd Bischl, David Rügamer | | 214 · 1 · 0 | 04 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks | Hippolyte Labarrière, C. Molinari, Lorenzo Rosasco, S. Villa, Cristian Vega | | 142 · 0 · 0 | 21 Dec 2024
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning | Hojoon Lee, Dongyoon Hwang, Donghu Kim, Hyunseung Kim, Jun Jet Tai, K. Subramanian, Peter R. Wurman, Jaegul Choo, Peter Stone, Takuma Seno | OffRL | 107 · 14 · 0 | 13 Oct 2024
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion | Zhiwei Bai, Jiajie Zhao, Yaoyu Zhang | AI4CE | 55 · 0 · 0 | 22 May 2024
Large-time asymptotics in deep learning | Carlos Esteve, Borjan Geshkovski, Dario Pighin, Enrique Zuazua | | 77 · 34 · 0 | 06 Aug 2020
The Implicit Regularization of Stochastic Gradient Flow for Least Squares | Alnur Ali, Yan Sun, Robert Tibshirani | | 56 · 77 · 0 | 17 Mar 2020
Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study | Assaf Dauber, M. Feder, Tomer Koren, Roi Livni | | 44 · 24 · 0 | 13 Mar 2020
Risk and parameter convergence of logistic regression | Ziwei Ji, Matus Telgarsky | | 38 · 129 · 0 | 20 Mar 2018
Convergence of Gradient Descent on Separable Data | Mor Shpigel Nacson, Jason D. Lee, Suriya Gunasekar, Pedro H. P. Savarese, Nathan Srebro, Daniel Soudry | | 60 · 167 · 0 | 05 Mar 2018
Characterizing Implicit Bias in Terms of Optimization Geometry | Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nathan Srebro | AI4CE | 62 · 404 · 0 | 22 Feb 2018
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations | Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang | | 53 · 31 · 0 | 26 Dec 2017
Don't Decay the Learning Rate, Increase the Batch Size | Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le | ODL | 93 · 990 · 0 | 01 Nov 2017
The Implicit Bias of Gradient Descent on Separable Data | Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro | | 74 · 908 · 0 | 27 Oct 2017
Implicit Regularization in Matrix Factorization | Suriya Gunasekar, Blake E. Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nathan Srebro | | 65 · 490 · 0 | 25 May 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks | Elad Hoffer, Itay Hubara, Daniel Soudry | ODL | 142 · 799 · 0 | 24 May 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning | Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht | ODL | 50 · 1,023 · 0 | 23 May 2017
Geometry of Optimization and Implicit Regularization in Deep Learning | Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro | AI4CE | 47 · 132 · 0 | 08 May 2017
The loss surface of deep and wide neural networks | Quynh N. Nguyen, Matthias Hein | ODL | 89 · 284 · 0 | 26 Apr 2017
Sharp Minima Can Generalize For Deep Nets | Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio | ODL | 103 · 766 · 0 | 15 Mar 2017
Understanding deep learning requires rethinking generalization | Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals | HAI | 269 · 4,620 · 0 | 10 Nov 2016
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys | Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina | ODL | 84 · 769 · 0 | 06 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 362 · 2,922 · 0 | 15 Sep 2016
Learning to learn by gradient descent by gradient descent | Marcin Andrychowicz, Misha Denil, Sergio Gomez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas | | 85 · 2,000 · 0 | 14 Jun 2016
Deep Learning without Poor Local Minima | Kenji Kawaguchi | ODL | 165 · 922 · 0 | 23 May 2016
Path-SGD: Path-Normalized Optimization in Deep Neural Networks | Behnam Neyshabur, Ruslan Salakhutdinov, Nathan Srebro | ODL | 57 · 305 · 0 | 08 Jun 2015
In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning | Behnam Neyshabur, Ryota Tomioka, Nathan Srebro | AI4CE | 78 · 655 · 0 | 20 Dec 2014
Margins, Shrinkage, and Boosting | Matus Telgarsky | | 52 · 73 · 0 | 18 Mar 2013