1509.01240
Train faster, generalize better: Stability of stochastic gradient descent
3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer
Papers citing
"Train faster, generalize better: Stability of stochastic gradient descent"
50 / 194 papers shown
Stability Regularized Cross-Validation
Ryan Cory-Wright
A. Gómez
24
0
0
11 May 2025
Gradient Descent as a Shrinkage Operator for Spectral Bias
Simon Lucey
38
0
0
25 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou
Simin Fan
Martin Jaggi
Jie Fu
23
0
0
24 Apr 2025
Leave-One-Out Stable Conformal Prediction
Kiljae Lee
Yuan Zhang
34
0
0
16 Apr 2025
Better Rates for Random Task Orderings in Continual Linear Models
Itay Evron
Ran Levinstein
Matan Schliserman
Uri Sherman
Tomer Koren
Daniel Soudry
Nathan Srebro
CLL
35
0
0
06 Apr 2025
Randomized Pairwise Learning with Adaptive Sampling: A PAC-Bayes Analysis
Sijia Zhou
Yunwen Lei
Ata Kabán
29
0
0
03 Apr 2025
Learning Variational Inequalities from Data: Fast Generalization Rates under Strong Monotonicity
Eric Zhao
Tatjana Chavdarova
Michael I. Jordan
45
0
0
20 Feb 2025
Stability-based Generalization Bounds for Variational Inference
Yadi Wei
R. Khardon
BDL
44
0
0
17 Feb 2025
Understanding the Generalization Error of Markov algorithms through Poissonization
Benjamin Dupuis
Maxime Haddouche
George Deligiannidis
Umut Simsekli
44
0
0
11 Feb 2025
Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization
Dun Zeng
Zheshun Wu
Shiyu Liu
Yu Pan
Xiaoying Tang
Zenglin Xu
MLT
FedML
87
1
0
25 Nov 2024
Understanding Generalization in Quantum Machine Learning with Margins
Tak Hur
Daniel K. Park
AI4CE
26
1
0
11 Nov 2024
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Aleksandar Armacki
Shuhua Yu
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
55
2
0
17 Oct 2024
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Hossein Taheri
Christos Thrampoulidis
Arya Mazumdar
MLT
33
0
0
13 Oct 2024
OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement
Qinglun Li
Miao Zhang
Mengzhu Wang
Quanjun Yin
Li Shen
OODD
FedML
19
0
0
09 Oct 2024
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
45
1
0
04 Oct 2024
A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs
Yan Sun
Li Shen
Dacheng Tao
FedML
25
0
0
27 Sep 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
44
0
0
11 Jun 2024
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou
Nicolas Loizou
53
4
0
06 Jun 2024
Uniformly Stable Algorithms for Adversarial Training and Beyond
Jiancong Xiao
Jiawei Zhang
Zhimin Luo
Asuman Ozdaglar
AAML
40
0
0
03 May 2024
The Sample Complexity of Gradient Descent in Stochastic Convex Optimization
Roi Livni
MLT
29
1
0
07 Apr 2024
Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications
Lucas Böttcher
Gregory R. Wheeler
32
0
0
05 Apr 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
P. Ostroukhov
Aigerim Zhumabayeva
Chulu Xiang
Alexander Gasnikov
Martin Takáč
Dmitry Kamzolov
ODL
43
2
0
07 Feb 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
I. Redko
Jianfeng Zhang
Bo An
UQCV
72
1
0
17 Jan 2024
Convex SGD: Generalization Without Early Stopping
Julien Hendrickx
A. Olshevsky
MLT
LRM
25
1
0
08 Jan 2024
Generalization Bounds for Label Noise Stochastic Gradient Descent
Jung Eun Huh
Patrick Rebeschini
13
1
0
01 Nov 2023
Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm
Miaoxi Zhu
Li Shen
Bo Du
Dacheng Tao
18
6
0
31 Oct 2023
Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization
Liang Zhang
Junchi Yang
Amin Karbasi
Niao He
24
2
0
26 Oct 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta
El Houcine Bergou
Soumia Boucherouite
Nicklas Werge
M. Kandemir
Xin Li
26
0
0
19 Oct 2023
Differentially Private Non-convex Learning for Multi-layer Neural Networks
Hanpu Shen
Cheng-Long Wang
Zihang Xiang
Yiming Ying
Di Wang
35
7
0
12 Oct 2023
Adversarial Style Transfer for Robust Policy Optimization in Deep Reinforcement Learning
Md Masudur Rahman
Yexiang Xue
23
4
0
29 Aug 2023
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Minghao Yang
Xiyuan Wei
Tianbao Yang
Yiming Ying
34
1
0
07 Jul 2023
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Kuan-Fu Ding
Jingyang Li
Kim-Chuan Toh
25
8
0
26 Jun 2023
Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning
Patrik Okanovic
R. Waleffe
Vasilis Mageirakos
Konstantinos E. Nikolakakis
Amin Karbasi
Dionysis Kalogerias
Nezihe Merve Gürel
Theodoros Rekatsinas
DD
39
12
0
28 May 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang
Yunwen Lei
Di Wang
Yiming Ying
Ding-Xuan Zhou
MLT
27
3
0
26 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri
Christos Thrampoulidis
MLT
16
3
0
22 May 2023
Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent
Lingjiong Zhu
Mert Gurbuzbalaban
Anant Raj
Umut Simsekli
24
6
0
20 May 2023
Is Aggregation the Only Choice? Federated Learning via Layer-wise Model Recombination
Ming Hu
Zhihao Yue
Zhiwei Ling
Cheng Chen
Yihao Huang
Xian Wei
Xiang Lian
Yang Liu
Mingsong Chen
FedML
19
8
0
18 May 2023
Learning Trajectories are Generalization Indicators
Jingwen Fu
Zhizheng Zhang
Dacheng Yin
Yan Lu
Nanning Zheng
AI4CE
28
3
0
25 Apr 2023
Differentially Private Stochastic Convex Optimization in (Non)-Euclidean Space Revisited
Jinyan Su
Changhong Zhao
Di Wang
12
3
0
31 Mar 2023
Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize
Mert Gurbuzbalaban
Yuanhan Hu
Umut Simsekli
Lingjiong Zhu
LRM
11
1
0
10 Feb 2023
Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Haotian Ju
Dongyue Li
Aneesh Sharma
Hongyang R. Zhang
23
40
0
09 Feb 2023
U-Clip: On-Average Unbiased Stochastic Gradient Clipping
Bryn Elesedy
Marcus Hutter
11
1
0
06 Feb 2023
Efficient Gradient Approximation Method for Constrained Bilevel Optimization
Siyuan Xu
Minghui Zhu
19
19
0
03 Feb 2023
Bagging Provides Assumption-free Stability
Jake A. Soloff
Rina Foygel Barber
Rebecca Willett
19
9
0
30 Jan 2023
On the Lipschitz Constant of Deep Networks and Double Descent
Matteo Gamba
Hossein Azizpour
Marten Bjorkman
21
7
0
28 Jan 2023
Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions
Anant Raj
Lingjiong Zhu
Mert Gurbuzbalaban
Umut Simsekli
21
15
0
27 Jan 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
48
34
0
27 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation
Xiao-Tong Yuan
P. Li
32
2
0
09 Jan 2023
Resampling Sensitivity of High-Dimensional PCA
Haoyu Wang
21
0
0
30 Dec 2022