Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks

4 July 2024
Amit Peleg, Matthias Hein
arXiv (abs) · PDF · HTML

Papers citing "Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks"

33 papers shown

A Modern Look at the Relationship between Sharpness and Generalization (ICML 2023)
Maksym Andriushchenko, Francesco Croce, Maximilian Müller, Matthias Hein, Nicolas Flammarion
Citations: 81 · 14 Feb 2023

SGD with Large Step Sizes Learns Sparse Features (ICML 2023)
Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion
Citations: 69 · 11 Oct 2022

Investigating Generalization by Controlling Normalized Margin (ICML 2022)
Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue
Citations: 6 · 08 May 2022

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
Citations: 81 · 29 Sep 2021

Sharpness-Aware Minimization for Efficiently Improving Generalization (ICLR 2021)
Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur
Citations: 1,632 · 03 Oct 2020

Is SGD a Bayesian sampler? Well, almost
Chris Mingard, Guillermo Valle Pérez, Joar Skalse, A. Louis
Citations: 62 · 26 Jun 2020

The Pitfalls of Simplicity Bias in Neural Networks (NeurIPS 2020)
Harshay Shah, Kaustav Tamuly, Aditi Raghunathan, Prateek Jain, Praneeth Netrapalli
Citations: 411 · 13 Jun 2020

Bayesian Deep Learning and a Probabilistic Perspective of Generalization (NeurIPS 2020)
A. Wilson, Pavel Izmailov
Citations: 730 · 20 Feb 2020

Understanding Why Neural Networks Generalize Well Through GSNR of Parameters (ICLR 2020)
Jinlong Liu, Guo-qing Jiang, Yunzhi Bai, Ting Chen, Huayan Wang
Citations: 55 · 21 Jan 2020

Fantastic Generalization Measures and Where to Find Them (ICLR 2020)
Yiding Jiang, Behnam Neyshabur, H. Mobahi, Dilip Krishnan, Samy Bengio
Citations: 666 · 04 Dec 2019

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks (NeurIPS 2019)
Yuanzhi Li, Colin Wei, Tengyu Ma
Citations: 324 · 10 Jul 2019

Benign Overfitting in Linear Regression (PNAS 2020)
Peter L. Bartlett, Philip M. Long, Gábor Lugosi, Alexander Tsigler
Citations: 849 · 26 Jun 2019

Understanding Generalization through Visualizations
Wenjie Huang, Z. Emam, Micah Goldblum, Liam H. Fowl, J. K. Terry, Furong Huang, Tom Goldstein
Citations: 86 · 07 Jun 2019

Implicit Regularization in Deep Matrix Factorization (NeurIPS 2019)
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
Citations: 556 · 31 May 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington
Citations: 1,204 · 18 Feb 2019

Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal
Citations: 1,856 · 28 Dec 2018

A Convergence Theory for Deep Learning via Over-Parameterization (ICML 2019)
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
Citations: 1,546 · 09 Nov 2018

A Modern Take on the Bias-Variance Tradeoff in Neural Networks
Brady Neal, Sarthak Mittal, A. Baratin, Vinayak Tantia, Matthew Scicluna, Damien Scieur, Alexia Jolicoeur-Martineau
Citations: 178 · 19 Oct 2018

Generalization Error in Deep Learning
Daniel Jakubovitz, Raja Giryes, M. Rodrigues
Citations: 125 · 03 Aug 2018

Deep learning generalizes because the parameter-function map is biased towards simple functions
Guillermo Valle Pérez, Chico Q. Camargo, A. Louis
Citations: 253 · 22 May 2018

Averaging Weights Leads to Wider Optima and Better Generalization (UAI 2018)
Pavel Izmailov, Dmitrii Podoprikhin, T. Garipov, Dmitry Vetrov, A. Wilson
Citations: 1,866 · 14 Mar 2018

Deep Learning Scaling is Predictable, Empirically
Joel Hestness, Sharan Narang, Newsha Ardalani, G. Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou
Citations: 873 · 01 Dec 2017

The Implicit Bias of Gradient Descent on Separable Data (JMLR 2018)
Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro
Citations: 999 · 27 Oct 2017

High-dimensional dynamics of generalization error in neural networks
Madhu S. Advani, Andrew M. Saxe
Citations: 498 · 10 Oct 2017

A Closer Look at Memorization in Deep Networks
Devansh Arpit, Stanislaw Jastrzebski, Nicolas Ballas, David M. Krueger, Emmanuel Bengio, ..., Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Damien Scieur
Citations: 2,026 · 16 Jun 2017

Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer, Itay Hubara, Daniel Soudry
Citations: 844 · 24 May 2017

Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
Gintare Karolina Dziugaite, Daniel M. Roy
Citations: 881 · 31 Mar 2017

Understanding deep learning requires rethinking generalization
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
Citations: 4,890 · 10 Nov 2016

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
Citations: 3,206 · 15 Sep 2016

An overview of gradient descent optimization algorithms
Sebastian Ruder
Citations: 6,694 · 15 Sep 2016

Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Citations: 215,113 · 10 Dec 2015

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Citations: 19,784 · 06 Feb 2015

In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning (ICLR 2015)
Behnam Neyshabur, Ryota Tomioka, Nathan Srebro
Citations: 690 · 20 Dec 2014