arXiv: 1710.09430
A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)
Foundations of Software Technology and Theoretical Computer Science (FSTTCS), 2017
25 October 2017
Prateek Jain, Sham Kakade, Rahul Kidambi, Praneeth Netrapalli, Krishna Pillutla, Aaron Sidford
Papers citing "A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)" (29 of 29 papers shown)
Seesaw: Accelerating Training by Balancing Learning Rate and Batch Size Scheduling
Alexandru Meterez, Depen Morwani, Jingfeng Wu, Costin-Andrei Oncescu, Cengiz Pehlevan, Sham Kakade
16 Oct 2025

On the Interplay between Graph Structure and Learning Algorithms in Graph Neural Networks
Junwei Su, Chuan Wu
20 Aug 2025

Improved Scaling Laws in Linear Regression via Data Reuse
Licong Lin, Jingfeng Wu, Peter Bartlett
10 Jun 2025

The Optimality of (Accelerated) SGD for High-Dimensional Quadratic Optimization
Haihan Zhang, Yuanshi Liu, Qianwen Chen, Cong Fang
15 Sep 2024

Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin, Jingfeng Wu, Sham Kakade, Peter L. Bartlett, Jason D. Lee
12 Jun 2024

Understanding Forgetting in Continual Learning with Linear Regression
Meng Ding, Kaiyi Ji, Haiyan Zhao, Jinhui Xu
27 May 2024

Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems
Junwei Su, Difan Zou, Chuan Wu
13 Mar 2024

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
International Conference on Learning Representations (ICLR), 2023
Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Peter L. Bartlett
12 Oct 2023

Correlated Noise Provably Beats Independent Noise for Differentially Private Learning
International Conference on Learning Representations (ICLR), 2023
Christopher A. Choquette-Choo, Krishnamurthy Dvijotham, Krishna Pillutla, Arun Ganesh, Thomas Steinke, Abhradeep Thakurta
10 Oct 2023

Convergence and concentration properties of constant step-size SGD through Markov chains
Ibrahim Merad, Stéphane Gaïffas
20 Jun 2023

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Annual Conference Computational Learning Theory (COLT), 2023
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz
21 Feb 2023

Statistical and Computational Guarantees for Influence Diagnostics
Jillian R. Fisher, Lang Liu, Krishna Pillutla, Y. Choi, Zaïd Harchaoui
08 Dec 2022

Local SGD in Overparameterized Linear Regression
Mike Nguyen, Charly Kirst, Nicole Mücke
20 Oct 2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
Neural Information Processing Systems (NeurIPS), 2022
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu, Sham Kakade
03 Aug 2022

(Nearly) Optimal Private Linear Regression via Adaptive Clipping
Prateeksha Varshney, Abhradeep Thakurta, Prateek Jain
11 Jul 2022

Provable Generalization of Overparameterized Meta-learning Trained with SGD
Neural Information Processing Systems (NeurIPS), 2022
Yu Huang, Yingbin Liang, Longbo Huang
18 Jun 2022

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime
Neural Information Processing Systems (NeurIPS), 2022
Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham Kakade
07 Mar 2022

On the Double Descent of Random Features Models Trained with SGD
Fanghui Liu, Johan A. K. Suykens, Volkan Cevher
13 Oct 2021

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
International Conference on Machine Learning (ICML), 2021
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu, Sham Kakade
12 Oct 2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems
Neural Information Processing Systems (NeurIPS), 2021
Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean Phillips Foster, Sham Kakade
10 Aug 2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression
Annual Conference Computational Learning Theory (COLT), 2021
Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham Kakade
23 Mar 2021

On the Regularization Effect of Stochastic Gradient Descent applied to Least Squares
Stefan Steinerberger
27 Jul 2020

The Implicit Regularization of Stochastic Gradient Flow for Least Squares
International Conference on Machine Learning (ICML), 2020
Alnur Ali, Guang Cheng, Robert Tibshirani
17 Mar 2020

Robust Aggregation for Federated Learning
IEEE Transactions on Signal Processing (IEEE Trans. Signal Process.), 2019
Krishna Pillutla, Sham Kakade, Zaïd Harchaoui
31 Dec 2019

On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Annual Conference Computational Learning Theory (COLT), 2019
Alekh Agarwal, Sham Kakade, Jason D. Lee, G. Mahajan
01 Aug 2019

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares
Neural Information Processing Systems (NeurIPS), 2019
Rong Ge, Sham Kakade, Rahul Kidambi, Praneeth Netrapalli
29 Apr 2019

Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms via Diffusion Approximation
Communications in Mathematical Sciences (Comm. Math. Sci.), 2019
Yuanyuan Feng, Tingran Gao, Lei Li, Jian-Guo Liu, Yulong Lu
02 Feb 2019

Iterate averaging as regularization for stochastic gradient descent
Gergely Neu, Lorenzo Rosasco
22 Feb 2018

HiGrad: Uncertainty Quantification for Online Learning and Stochastic Approximation
Weijie J. Su, Yuancheng Zhu
13 Feb 2018