Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
Sharan Vaswani, Francis R. Bach, Mark W. Schmidt
16 October 2018 · arXiv:1810.07288

Papers citing "Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron"

Showing 50 of 56 citing papers.
Better Rates for Random Task Orderings in Continual Linear Models
Itay Evron, Ran Levinstein, Matan Schliserman, Uri Sherman, Tomer Koren, Daniel Soudry, Nathan Srebro
06 Apr 2025 · CLL

Nesterov acceleration in benignly non-convex landscapes
Kanan Gupta, Stephan Wojtowytsch
10 Oct 2024

Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models
Matteo Lapucci, Davide Pucci
06 Aug 2024

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou, Nicolas Loizou
06 Jun 2024

Demystifying SGD with Doubly Stochastic Gradients
Kyurae Kim, Joohwan Ko, Yian Ma, Jacob R. Gardner
03 Jun 2024

Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation
Aaron Mishkin, Mert Pilanci, Mark Schmidt
03 Apr 2024

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang, Yi Zhou, Shaofeng Zou
01 Apr 2024

An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification
Hyenkyun Woo
26 Dec 2023

Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
R. Karandikar, M. Vidyasagar
05 Dec 2023

On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni, Nicklas Werge
29 Nov 2023 · ODL

Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee
29 Nov 2023 · AAML

Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta, El Houcine Bergou, Soumia Boucherouite, Nicklas Werge, M. Kandemir, Xin Li
19 Oct 2023

Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates
Ahmad Rammal, Kaja Gruntkowska, Nikita Fedin, Eduard A. Gorbunov, Peter Richtárik
15 Oct 2023

Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions
Bo Wang, Huishuai Zhang, Zhirui Ma, Wei Chen
29 May 2023

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
Aleksandr Beznosikov, S. Samsonov, Marina Sheshukova, Alexander Gasnikov, A. Naumov, Eric Moulines
25 May 2023

Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri, Christos Thrampoulidis
22 May 2023 · MLT

Single-Call Stochastic Extragradient Methods for Structured Non-monotone Variational Inequalities: Improved Analysis under Weaker Conditions
S. Choudhury, Eduard A. Gorbunov, Nicolas Loizou
27 Feb 2023

From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro
12 Feb 2023 · MLT

Adaptive Compression for Communication-Efficient Distributed Training
Maksim Makarenko, Elnur Gasanov, Rustem Islamov, Abdurakhmon Sadiev, Peter Richtárik
31 Oct 2022

Private optimization in the interpolation regime: faster rates and hardness results
Hilal Asi, Karan N. Chadha, Gary Cheng, John C. Duchi
31 Oct 2022

Hierarchical Federated Learning with Momentum Acceleration in Multi-Tier Networks
Zhengjie Yang, Sen Fu, Wei Bao, Dong Yuan, Albert Y. Zomaya
26 Oct 2022 · FedML

Stability and Generalization for Markov Chain Stochastic Gradient Methods
Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou
16 Sep 2022

Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion
G. Zhang, Hong-Ming Chiu, Richard Y. Zhang
24 Aug 2022

Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhimin Luo
20 Aug 2022

Improved Policy Optimization for Online Imitation Learning
J. Lavington, Sharan Vaswani, Mark W. Schmidt
29 Jul 2022 · OffRL

On the fast convergence of minibatch heavy ball momentum
Raghu Bollapragada, Tyler Chen, Rachel A. Ward
15 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran
13 Jun 2022

Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods
Aleksandr Beznosikov, Eduard A. Gorbunov, Hugo Berard, Nicolas Loizou
15 Feb 2022

Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
Trang H. Tran, K. Scheinberg, Lam M. Nguyen
07 Feb 2022

A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. P. Kumar
29 Jan 2022

HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images
Meirui Jiang, Zirui Wang, Qi Dou
20 Dec 2021 · FedML

Convergence of Uncertainty Sampling for Active Learning
Anant Raj, Francis R. Bach
29 Oct 2021

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio, Nicolas Loizou, I. Laradji, Ioannis Mitliagkas
28 Oct 2021

Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad
21 Oct 2021

Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization
Tao Sun, Huaming Ling, Zuoqiang Shi, Dongsheng Li, Bao Wang
18 Oct 2021 · ODL

A general sample complexity analysis of vanilla policy gradient
Rui Yuan, Robert Mansel Gower, A. Lazaric
23 Jul 2021

Stochastic Polyak Stepsize with a Moving Target
Robert Mansel Gower, Aaron Defazio, Michael G. Rabbat
22 Jun 2021

Stochastic gradient descent with noise of machine learning type. Part I: Discrete time analysis
Stephan Wojtowytsch
04 May 2021

SVRG Meets AdaGrad: Painless Variance Reduction
Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark W. Schmidt, Simon Lacoste-Julien
18 Feb 2021

Convergence of stochastic gradient descent schemes for Lojasiewicz-landscapes
Steffen Dereich, Sebastian Kassing
16 Feb 2021

On Riemannian Stochastic Approximation Schemes with Fixed Step-Size
Alain Durmus, P. Jiménez, Eric Moulines, Salem Said
15 Feb 2021

Federated Learning with Nesterov Accelerated Gradient
Zhengjie Yang, Wei Bao, Dong Yuan, Nguyen H. Tran, Albert Y. Zomaya
18 Sep 2020 · FedML

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters
Filip Hanzely
26 Aug 2020

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou
18 Jun 2020

Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen, Colin Wei, J. Lee, Tengyu Ma
15 Jun 2020

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias
Lu Yu, Krishnakumar Balasubramanian, S. Volgushev, Murat A. Erdogdu
14 Jun 2020

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien
24 Feb 2020

Towards Label-Free 3D Segmentation of Optical Coherence Tomography Images of the Optic Nerve Head Using Deep Learning
S. Devalla, T. Pham, S. Panda, Zhang Liang, Giridhar Subramanian, ..., L. Schmetterer, S. Perera, Tin Aung, Alexandre Hoang Thiery, M. Girard
22 Feb 2020

Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor
M. Safaryan, Egor Shulgin, Peter Richtárik
20 Feb 2020

Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems
Filip Hanzely, D. Kovalev, Peter Richtárik
11 Feb 2020