
Closing the convergence gap of SGD without replacement

International Conference on Machine Learning (ICML), 2020
24 February 2020
Shashank Rajput
Anant Gupta
Dimitris Papailiopoulos

Papers citing "Closing the convergence gap of SGD without replacement"

Showing 50 of 53 citing papers.
Towards Continuous-Time Approximations for Stochastic Gradient Descent without Replacement
Stefan Perko
04 Dec 2025
Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees
Daniil Medyakov
Gleb Molodtsov
Grigoriy Evseev
Egor Petrov
Aleksandr Beznosikov
04 Sep 2025
Proto-EVFL: Enhanced Vertical Federated Learning via Dual Prototype with Extremely Unaligned Data
Wei Guo
Yiyang Duan
Zhaojun Hu
Yiqi Tong
Fuzhen Zhuang
Qi. Wang
Jin Song Dong
R. Wu
Tengfei Liu
Yifan Sun
30 Jul 2025
Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
Amit Attia
Matan Schliserman
Uri Sherman
Tomer Koren
15 Jul 2025
Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting
Duc Toan Nguyen
Trang H. Tran
Lam M. Nguyen
14 Jun 2025
Rapid Overfitting of Multi-Pass Stochastic Gradient Descent in Stochastic Convex Optimization
Shira Vansover-Hager
Tomer Koren
Roi Livni
13 May 2025
Low-Rank Thinning
Annabelle Michael Carrell
Albert Gong
Abhishek Shetty
Raaz Dwivedi
Lester W. Mackey
17 Feb 2025
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
11 Jun 2024
Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data
Xuxing Chen
Abhishek Roy
Yifan Hu
Krishnakumar Balasubramanian
29 May 2024
On the Last-Iterate Convergence of Shuffling Gradient Methods
International Conference on Machine Learning (ICML), 2024
Zijian Liu
Zhengyuan Zhou
12 Mar 2024
Last Iterate Convergence of Incremental Methods and Applications in Continual Learning
Xu Cai
Jelena Diakonikolas
11 Mar 2024
Shuffling Momentum Gradient Algorithm for Convex Optimization
Trang H. Tran
Quoc Tran-Dinh
Lam M. Nguyen
05 Mar 2024
High Probability Guarantees for Random Reshuffling
Hengxu Yu
Xiao Li
20 Nov 2023
Convergence of Sign-based Random Reshuffling Algorithms for Nonconvex Optimization
Zhen Qin
Zhishuai Liu
Pan Xu
24 Oct 2023
Mini-Batch Optimization of Contrastive Loss
Jaewoong Cho
Kartik K. Sreenivasan
Keon Lee
Kyunghoo Mun
Soheun Yi
Jeong-Gwan Lee
Anna Lee
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
12 Jul 2023
Ordering for Non-Replacement SGD
Yuetong Xu
Baharan Mirzasoleiman
28 Jun 2023
Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds
Xu Cai
Cheuk Yin Lin
Jelena Diakonikolas
21 Jun 2023
On Convergence of Incremental Gradient for Non-Convex Smooth Functions
International Conference on Machine Learning (ICML), 2023
Anastasia Koloskova
N. Doikov
Sebastian U. Stich
Martin Jaggi
30 May 2023
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
Chen Fan
Christos Thrampoulidis
Mark Schmidt
02 Apr 2023
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
International Conference on Machine Learning (ICML), 2023
Jaeyoung Cha
Jaewook Lee
Chulhee Yun
13 Mar 2023
On the Training Instability of Shuffling SGD with Batch Normalization
International Conference on Machine Learning (ICML), 2023
David Wu
Chulhee Yun
S. Sra
24 Feb 2023
On the Convergence of Federated Averaging with Cyclic Client Participation
International Conference on Machine Learning (ICML), 2023
Yae Jee Cho
Pranay Sharma
Gauri Joshi
Zheng Xu
Satyen Kale
Tong Zhang
06 Feb 2023
Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness
Computational Optimization and Applications (Comput. Optim. Appl.), 2022
R. Seccia
Corrado Coppola
G. Liuzzi
L. Palagi
04 Dec 2022
Adaptive Compression for Communication-Efficient Distributed Training
Maksim Makarenko
Elnur Gasanov
Rustem Islamov
Abdurakhmon Sadiev
Peter Richtárik
31 Oct 2022
Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Xianyang Zhang
Trisha Dawn
21 Oct 2022
SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization
International Conference on Learning Representations (ICLR), 2022
Hanseul Cho
Chulhee Yun
12 Oct 2022
Efficiency Ordering of Stochastic Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Jie Hu
Vishwaraj Doshi
Do Young Eun
15 Sep 2022
Federated Optimization Algorithms with Random Reshuffling and Gradient Compression
Abdurakhmon Sadiev
Grigory Malinovsky
Eduard A. Gorbunov
Igor Sokolov
Ahmed Khaled
Konstantin Burlachenko
Peter Richtárik
14 Jun 2022
Stochastic Gradient Descent without Full Data Shuffle
The VLDB Journal (VLDBJ), 2022
Lijie Xu
Delin Qu
Binhang Yuan
Jiawei Jiang
Cédric Renggli
...
Guoliang Li
Ji Liu
Wentao Wu
Jieping Ye
Ce Zhang
12 Jun 2022
Sampling without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization
Neural Information Processing Systems (NeurIPS), 2022
Aniket Das
Bernhard Schölkopf
Michael Muehlebach
07 Jun 2022
Computing the Variance of Shuffling Stochastic Gradient Algorithms via Power Spectral Density Analysis
Carles Domingo-Enrich
01 Jun 2022
Federated Random Reshuffling with Compression and Variance Reduction
Grigory Malinovsky
Peter Richtárik
08 May 2022
Benign Underfitting of Stochastic Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Tomer Koren
Roi Livni
Yishay Mansour
Uri Sherman
27 Feb 2022
Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
International Conference on Machine Learning (ICML), 2022
Trang H. Tran
K. Scheinberg
Lam M. Nguyen
07 Feb 2022
Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods
Amirkeivan Mohtashami
Sebastian U. Stich
Martin Jaggi
03 Feb 2022
Optimal Rates for Random Order Online Optimization
Neural Information Processing Systems (NeurIPS), 2021
Uri Sherman
Tomer Koren
Yishay Mansour
29 Jun 2021
Zeroth-Order Methods for Convex-Concave Minmax Problems: Applications to Decision-Dependent Risk Minimization
C. Maheshwari
Chih-Yuan Chiu
Eric Mazumdar
S. Shankar Sastry
Lillian J. Ratliff
16 Jun 2021
Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems
Neural Information Processing Systems (NeurIPS), 2021
Itay Safran
Ohad Shamir
12 Jun 2021
Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Yaodong Yu
Tianyi Lin
Eric Mazumdar
Sai Li
27 Apr 2021
Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders
Xinmeng Huang
Kun Yuan
Xianghui Mao
W. Yin
25 Apr 2021
Random Reshuffling with Variance Reduction: New Analysis and Better Rates
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Grigory Malinovsky
Alibek Sailanbayev
Peter Richtárik
19 Apr 2021
Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?
Chulhee Yun
S. Sra
Ali Jadbabaie
12 Mar 2021
Permutation-Based SGD: Is Random Optimal?
International Conference on Learning Representations (ICLR), 2021
Shashank Rajput
Kangwook Lee
Dimitris Papailiopoulos
19 Feb 2021
Recent Theoretical Advances in Non-Convex Optimization
Marina Danilova
Pavel Dvurechensky
Alexander Gasnikov
Eduard A. Gorbunov
Sergey Guminov
Dmitry Kamzolov
Innokentiy Shibaev
11 Dec 2020
SMG: A Shuffling Gradient-Based Method with Momentum
International Conference on Machine Learning (ICML), 2020
Trang H. Tran
Lam M. Nguyen
Quoc Tran-Dinh
24 Nov 2020
Breaking the Communication-Privacy-Accuracy Trilemma
Wei-Ning Chen
Peter Kairouz
Ayfer Özgür
22 Jul 2020
Incremental Without Replacement Sampling in Nonconvex Optimization
Journal of Optimization Theory and Applications (JOTA), 2020
Edouard Pauwels
15 Jul 2020
Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization
Chaobing Song
Yong Jiang
Yi-An Ma
18 Jun 2020
SGD with shuffling: optimal rates without component convexity and large epoch requirements
Neural Information Processing Systems (NeurIPS), 2020
Kwangjun Ahn
Chulhee Yun
S. Sra
12 Jun 2020
Random Reshuffling: Simple Analysis with Vast Improvements
Neural Information Processing Systems (NeurIPS), 2020
Konstantin Mishchenko
Ahmed Khaled
Peter Richtárik
10 Jun 2020