ResearchTrend.AI

arXiv: 1908.00045
How Good is SGD with Random Shuffling?
Itay Safran, Ohad Shamir
Annual Conference on Computational Learning Theory (COLT), 2019. 31 July 2019.

Papers citing "How Good is SGD with Random Shuffling?"

Showing 50 of 60 citing papers.
- On the Limits of Momentum in Decentralized and Federated Optimization [FedML]. Riccardo Zaccone, Sai Praneeth Karimireddy, Carlo Masone. 25 Nov 2025.
- Aspiration-based Perturbed Learning Automata in Games with Noisy Utility Measurements. Part B: Stochastic Stability in Weakly Acyclic Games. Georgios C. Chasparis. European Control Conference (ECC), 2018. 23 Nov 2025.
- Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees. Daniil Medyakov, Gleb Molodtsov, Grigoriy Evseev, Egor Petrov, Aleksandr Beznosikov. 04 Sep 2025.
- Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime. Amit Attia, Matan Schliserman, Uri Sherman, Tomer Koren. 15 Jul 2025.
- Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting. Duc Toan Nguyen, Trang H. Tran, Lam M. Nguyen. 14 Jun 2025.
- Rapid Overfitting of Multi-Pass Stochastic Gradient Descent in Stochastic Convex Optimization. Shira Vansover-Hager, Tomer Koren, Roi Livni. 13 May 2025.
- Efficient GNN Training Through Structure-Aware Randomized Mini-Batching [GNN]. Vignesh Balaji, Christos Kozyrakis, Gal Chechik, Haggai Maron. 25 Apr 2025.
- Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling. Daniil Medyakov, Gleb Molodtsov, S. Chezhegov, Alexey Rebrikov, Aleksandr Beznosikov. 20 Feb 2025.
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling. Junyi Li, Heng Huang. Neural Information Processing Systems (NeurIPS), 2024. 07 Nov 2024.
- Does Worst-Performing Agent Lead the Pack? Analyzing Agent Dynamics in Unified Distributed SGD [FedML]. Jie Hu, Yi-Ting Ma, Do Young Eun. Neural Information Processing Systems (NeurIPS), 2024. 26 Sep 2024.
- Loss Gradient Gaussian Width based Generalization and Optimization Guarantees. A. Banerjee, Qiaobo Li, Yingxue Zhou. 11 Jun 2024.
- Demystifying SGD with Doubly Stochastic Gradients. Kyurae Kim, Joohwan Ko, Yian Ma, Jacob R. Gardner. 03 Jun 2024.
- Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training. Yanlai Yang, Matt Jones, Michael C. Mozer, Mengye Ren. Neural Information Processing Systems (NeurIPS), 2024. 14 Mar 2024.
- On the Last-Iterate Convergence of Shuffling Gradient Methods. Zijian Liu, Zhengyuan Zhou. International Conference on Machine Learning (ICML), 2024. 12 Mar 2024.
- Last Iterate Convergence of Incremental Methods and Applications in Continual Learning. Xu Cai, Jelena Diakonikolas. 11 Mar 2024.
- Shuffling Momentum Gradient Algorithm for Convex Optimization. Trang H. Tran, Quoc Tran-Dinh, Lam M. Nguyen. 05 Mar 2024.
- Understanding the Training Speedup from Sampling with Approximate Losses. Rudrajit Das, Xi Chen, Bertram Ieong, Parikshit Bansal, Sujay Sanghavi. International Conference on Machine Learning (ICML), 2024. 10 Feb 2024.
- Central Limit Theorem for Two-Timescale Stochastic Approximation with Markovian Noise: Theory and Applications. Jie Hu, Vishwaraj Doshi, Do Young Eun. International Conference on Artificial Intelligence and Statistics (AISTATS), 2024. 17 Jan 2024.
- High Probability Guarantees for Random Reshuffling. Hengxu Yu, Xiao Li. 20 Nov 2023.
- Convergence of Sign-based Random Reshuffling Algorithms for Nonconvex Optimization. Zhen Qin, Zhishuai Liu, Pan Xu. 24 Oct 2023.
- Corgi^2: A Hybrid Offline-Online Approach To Storage-Aware Data Shuffling For SGD [OffRL]. Etay Livne, Gal Kaplun, Eran Malach, Shai Shalev-Shwartz. 04 Sep 2023.
- Mini-Batch Optimization of Contrastive Loss [SSL]. Jaewoong Cho, Kartik K. Sreenivasan, Keon Lee, Kyunghoo Mun, Soheun Yi, Jeong-Gwan Lee, Anna Lee, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee. 12 Jul 2023.
- Ordering for Non-Replacement SGD. Yuetong Xu, Baharan Mirzasoleiman. 28 Jun 2023.
- Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds [FedML]. Xu Cai, Cheuk Yin Lin, Jelena Diakonikolas. 21 Jun 2023.
- On Convergence of Incremental Gradient for Non-Convex Smooth Functions. Anastasia Koloskova, N. Doikov, Sebastian U. Stich, Martin Jaggi. International Conference on Machine Learning (ICML), 2023. 30 May 2023.
- Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning [DD]. Patrik Okanovic, R. Waleffe, Vasilis Mageirakos, Konstantinos E. Nikolakakis, Amin Karbasi, Dionysis Kalogerias, Nezihe Merve Gürel, Theodoros Rekatsinas. International Conference on Learning Representations (ICLR), 2023. 28 May 2023.
- Select without Fear: Almost All Mini-Batch Schedules Generalize Optimally. Konstantinos E. Nikolakakis, Amin Karbasi, Dionysis Kalogerias. 03 May 2023.
- High-dimensional limit of one-pass SGD on least squares. Elizabeth Collins-Woodfin, Elliot Paquette. Electronic Communications in Probability (ECP), 2023. 13 Apr 2023.
- Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition. Chen Fan, Christos Thrampoulidis, Mark Schmidt. 02 Apr 2023.
- Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond. Jaeyoung Cha, Jaewook Lee, Chulhee Yun. International Conference on Machine Learning (ICML), 2023. 13 Mar 2023.
- On the Training Instability of Shuffling SGD with Batch Normalization. David Wu, Chulhee Yun, S. Sra. International Conference on Machine Learning (ICML), 2023. 24 Feb 2023.
- On the Convergence of Federated Averaging with Cyclic Client Participation [FedML]. Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang. International Conference on Machine Learning (ICML), 2023. 06 Feb 2023.
- Efficiency Ordering of Stochastic Gradient Descent. Jie Hu, Vishwaraj Doshi, Do Young Eun. Neural Information Processing Systems (NeurIPS), 2022. 15 Sep 2022.
- On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms. Lam M. Nguyen, Trang H. Tran. Neural Information Processing Systems (NeurIPS), 2022. 13 Jun 2022.
- Stochastic Gradient Descent without Full Data Shuffle. Lijie Xu, Delin Qu, Binhang Yuan, Jiawei Jiang, Cédric Renggli, …, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang. The VLDB Journal (VLDBJ), 2022. 12 Jun 2022.
- Sampling without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization. Aniket Das, Bernhard Schölkopf, Michael Muehlebach. Neural Information Processing Systems (NeurIPS), 2022. 07 Jun 2022.
- Computing the Variance of Shuffling Stochastic Gradient Algorithms via Power Spectral Density Analysis. Carles Domingo-Enrich. 01 Jun 2022.
- Federated Random Reshuffling with Compression and Variance Reduction [FedML]. Grigory Malinovsky, Peter Richtárik. 08 May 2022.
- Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime. Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham Kakade. Neural Information Processing Systems (NeurIPS), 2022. 07 Mar 2022.
- Benign Underfitting of Stochastic Gradient Descent [MLT]. Tomer Koren, Roi Livni, Yishay Mansour, Uri Sherman. Neural Information Processing Systems (NeurIPS), 2022. 27 Feb 2022.
- Nesterov Accelerated Shuffling Gradient Method for Convex Optimization. Trang H. Tran, K. Scheinberg, Lam M. Nguyen. International Conference on Machine Learning (ICML), 2022. 07 Feb 2022.
- Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods. Amirkeivan Mohtashami, Sebastian U. Stich, Martin Jaggi. 03 Feb 2022.
- A Field Guide to Federated Optimization [FedML]. Jianyu Wang, Zachary B. Charles, Zheng Xu, Gauri Joshi, H. B. McMahan, …, Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu. 14 Jul 2021.
- Optimal Rates for Random Order Online Optimization. Uri Sherman, Tomer Koren, Yishay Mansour. Neural Information Processing Systems (NeurIPS), 2021. 29 Jun 2021.
- Zeroth-Order Methods for Convex-Concave Minmax Problems: Applications to Decision-Dependent Risk Minimization. C. Maheshwari, Chih-Yuan Chiu, Eric Mazumdar, S. Shankar Sastry, Lillian J. Ratliff. 16 Jun 2021.
- Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems. Itay Safran, Ohad Shamir. Neural Information Processing Systems (NeurIPS), 2021. 12 Jun 2021.
- Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [OOD]. Yaodong Yu, Tianyi Lin, Eric Mazumdar, Sai Li. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021. 27 Apr 2021.
- Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders. Xinmeng Huang, Kun Yuan, Xianghui Mao, W. Yin. 25 Apr 2021.
- Random Reshuffling with Variance Reduction: New Analysis and Better Rates. Grigory Malinovsky, Alibek Sailanbayev, Peter Richtárik. Conference on Uncertainty in Artificial Intelligence (UAI), 2021. 19 Apr 2021.
- Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? Chulhee Yun, S. Sra, Ali Jadbabaie. 12 Mar 2021.