The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

29 April 2019
Rong Ge
Sham Kakade
Rahul Kidambi
Praneeth Netrapalli

Papers citing "The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares"

50 / 69 papers shown
Better Rates for Random Task Orderings in Continual Linear Models
Itay Evron
Ran Levinstein
Matan Schliserman
Uri Sherman
Tomer Koren
Daniel Soudry
Nathan Srebro
CLL
35
0
0
06 Apr 2025
Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization
Amit Attia
Tomer Koren
67
1
0
13 Mar 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma
Nolan Dey
Gurpreet Gosal
Gavia Gray
Daria Soboleva
Joel Hestness
58
5
0
21 Feb 2025
Can Large Language Models Invent Algorithms to Improve Themselves?
Yoichi Ishibashi
Taro Yano
Masafumi Oyamada
AIFin
LRM
34
1
0
21 Oct 2024
The Optimality of (Accelerated) SGD for High-Dimensional Quadratic Optimization
Haihan Zhang
Yuanshi Liu
Qianwen Chen
Cong Fang
38
0
0
15 Sep 2024
SnapE -- Training Snapshot Ensembles of Link Prediction Models
Ali Shaban
Heiko Paulheim
VLM
30
1
0
05 Aug 2024
Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection
Yechan Kim
JongHyun Park
SooYeon Kim
Moongu Jeon
26
0
0
21 Jul 2024
Large Batch Analysis for Adagrad Under Anisotropic Smoothness
Yuxing Liu
Rui Pan
Tong Zhang
26
5
0
21 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin
Jingfeng Wu
Sham Kakade
Peter L. Bartlett
Jason D. Lee
LRM
44
15
0
12 Jun 2024
A Generalized Version of Chung's Lemma and its Applications
Li Jiang
Xiao Li
Andre Milzarek
Junwen Qiu
45
1
0
09 Jun 2024
Primitive Agentic First-Order Optimization
R. Sala
19
0
0
07 Jun 2024
Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Vignesh Kothapalli
Tianyu Pang
Shenyang Deng
Zongmin Liu
Yaoqing Yang
37
3
0
07 Jun 2024
New logarithmic step size for stochastic gradient descent
M. S. Shamaee
S. F. Hafshejani
Z. Saeidian
33
3
0
01 Apr 2024
A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques
Xuetong Li
Yuan Gao
Hong Chang
Danyang Huang
Yingying Ma
...
Ke Xu
Jing Zhou
Xuening Zhu
Yingqiu Zhu
Hansheng Wang
44
7
0
17 Mar 2024
On the Convergence of Federated Learning Algorithms without Data Similarity
Ali Beikmohammadi
Sarit Khirirat
Sindri Magnússon
FedML
35
1
0
29 Feb 2024
Provably Scalable Black-Box Variational Inference with Structured Variational Families
Joohwan Ko
Kyurae Kim
W. Kim
Jacob R. Gardner
BDL
33
2
0
19 Jan 2024
DREAM: Debugging and Repairing AutoML Pipelines
Xiaoyu Zhang
Juan Zhai
Shiqing Ma
Chao Shen
21
1
0
31 Dec 2023
An investigation of belief-free DRL and MCTS for inspection and maintenance planning
Daniel Koutas
E. Bismut
Daniel Straub
19
2
0
22 Dec 2023
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise
Rui Pan
Yuxing Liu
Xiaoyu Wang
Tong Zhang
23
5
0
22 Dec 2023
On the Role of Server Momentum in Federated Learning
Jianhui Sun
Xidong Wu
Heng-Chiao Huang
Aidong Zhang
FedML
60
11
0
19 Dec 2023
An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent
Zhao-quan Song
Chiwun Yang
29
9
0
17 Oct 2023
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Jingfeng Wu
Difan Zou
Zixiang Chen
Vladimir Braverman
Quanquan Gu
Peter L. Bartlett
125
49
0
12 Oct 2023
Team AcieLee: Technical Report for EPIC-SOUNDS Audio-Based Interaction Recognition Challenge 2023
Yuqi Li
Yi-Jhen Luo
Xiaoshuai Hao
Chuanguang Yang
Zhulin An
Dantong Song
Wei Yi
33
0
0
15 Jun 2023
Overcoming Catastrophic Forgetting in Massively Multilingual Continual Learning
Genta Indra Winata
Lingjue Xie
Karthik Radhakrishnan
Shijie Wu
Xisen Jin
Pengxiang Cheng
Mayank Kulkarni
Daniel Preotiuc-Pietro
CLL
18
18
0
25 May 2023
Fast and Straggler-Tolerant Distributed SGD with Reduced Computation Load
Maximilian Egger
Serge Kas Hanna
Rawad Bitar
FedML
24
0
0
17 Apr 2023
Learning Rate Schedules in the Presence of Distribution Shift
Matthew Fahrbach
Adel Javanmard
Vahab Mirrokni
Pratik Worah
24
6
0
27 Mar 2023
Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
Jingfeng Wu
Difan Zou
Zixiang Chen
Vladimir Braverman
Quanquan Gu
Sham Kakade
90
6
0
03 Mar 2023
Real-Time Damage Detection in Fiber Lifting Ropes Using Convolutional Neural Networks
Tuomas Jalonen
M. A. Sa'd
Roope Mellanen
S. Kiranyaz
Moncef Gabbouj
11
4
0
23 Feb 2023
Optimizing Learning Rate Schedules for Iterative Pruning of Deep Neural Networks
Shiyu Liu
Rohan Ghosh
John Tan Chong Min
Mehul Motani
37
0
0
09 Dec 2022
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
Jingfeng Wu
Difan Zou
Vladimir Braverman
Quanquan Gu
Sham Kakade
6
16
0
03 Aug 2022
Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions
Courtney Paquette
Elliot Paquette
Ben Adlam
Jeffrey Pennington
17
13
0
15 Jun 2022
Towards an AI-Driven Universal Anti-Jamming Solution with Convolutional Interference Cancellation Network
H. N. Nguyen
G. Noubir
14
1
0
18 Mar 2022
Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime
Difan Zou
Jingfeng Wu
Vladimir Braverman
Quanquan Gu
Sham Kakade
11
5
0
07 Mar 2022
Optimal learning rate schedules in high-dimensional non-convex optimization problems
Stéphane d'Ascoli
Maria Refinetti
Giulio Biroli
16
7
0
09 Feb 2022
On Uniform Boundedness Properties of SGD and its Momentum Variants
Xiaoyu Wang
M. Johansson
23
3
0
25 Jan 2022
Optimization Planning for 3D ConvNets
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Tao Mei
3DPC
3DH
32
9
0
11 Jan 2022
Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums
Rui Pan
Haishan Ye
Tong Zhang
14
14
0
27 Oct 2021
S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks
Shiyu Liu
Chong Min John Tan
Mehul Motani
CLL
29
4
0
17 Oct 2021
Adaptive Differentially Private Empirical Risk Minimization
Xiaoxia Wu
Lingxiao Wang
Irina Cristali
Quanquan Gu
Rebecca Willett
38
6
0
14 Oct 2021
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
Jingfeng Wu
Difan Zou
Vladimir Braverman
Quanquan Gu
Sham Kakade
104
20
0
12 Oct 2021
Towards Continual Entity Learning in Language Models for Conversational Agents
R. Gadde
I. Bulyko
KELM
14
1
0
30 Jul 2021
Bandwidth-based Step-Sizes for Non-Convex Stochastic Optimization
Xiaoyu Wang
M. Johansson
13
2
0
05 Jun 2021
Acceleration via Fractal Learning Rate Schedules
Naman Agarwal
Surbhi Goel
Cyril Zhang
16
18
0
01 Mar 2021
A Biased Graph Neural Network Sampler with Near-Optimal Regret
Qingru Zhang
David Wipf
Quan Gan
Le Song
40
24
0
01 Mar 2021
On the Convergence of Step Decay Step-Size for Stochastic Optimization
Xiaoyu Wang
Sindri Magnússon
M. Johansson
66
23
0
18 Feb 2021
Last iterate convergence of SGD for Least-Squares in the Interpolation regime
Aditya Varre
Loucas Pillaud-Vivien
Nicolas Flammarion
12
34
0
05 Feb 2021
Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede
32
2
0
04 Jan 2021
Reverse engineering learned optimizers reveals known and novel mechanisms
Niru Maheswaranathan
David Sussillo
Luke Metz
Ruoxi Sun
Jascha Narain Sohl-Dickstein
22
21
0
04 Nov 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
34
79
0
17 Sep 2020
Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum
Jerry Chee
Ping Li
6
11
0
27 Aug 2020