1602.02666
Cited By
A Variational Analysis of Stochastic Gradient Algorithms
8 February 2016
Stephan Mandt
Matthew D. Hoffman
David M. Blei
Papers citing
"A Variational Analysis of Stochastic Gradient Algorithms"
50 / 86 papers shown
Title
Algorithm- and Data-Dependent Generalization Bounds for Score-Based Generative Models
Benjamin Dupuis
Dario Shariatian
Maxime Haddouche
Alain Durmus
Umut Simsekli
56
0
0
04 Jun 2025
Models of Heavy-Tailed Mechanistic Universality
Liam Hodgkinson
Zhichao Wang
Michael W. Mahoney
73
1
0
04 Jun 2025
Stochastic Variational Inference with Tuneable Stochastic Annealing
John Paisley
G. Fazelnia
Brian Barr
51
0
0
04 Apr 2025
Understanding the Generalization Error of Markov algorithms through Poissonization
Benjamin Dupuis
Maxime Haddouche
George Deligiannidis
Umut Simsekli
101
0
0
11 Feb 2025
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang
Ziquan Zhu
Gaojie Jin
Lu Liu
Zhangyang Wang
Shiwei Liu
119
6
0
12 Jan 2025
Soft Condorcet Optimization for Ranking of General Agents
Marc Lanctot
Kate Larson
Michael Kaisers
Quentin Berthet
I. Gemp
Manfred Diaz
Roberto-Rafael Maura-Rivero
Yoram Bachrach
Anna Koop
Doina Precup
270
0
0
31 Oct 2024
Identifying Drift, Diffusion, and Causal Structure from Temporal Snapshots
Vincent Guan
Joseph Janssen
Hossein Rahmani
Andrew Warren
Stephen X. Zhang
Elina Robeva
Geoffrey Schiebinger
DiffM
138
7
0
30 Oct 2024
Noise-Aware Differentially Private Variational Inference
Talal Alrawajfeh
Hibiki Ito
Antti Honkela
151
0
0
25 Oct 2024
Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework
Siyuan Yu
Wei Chen
H. V. Poor
98
0
0
17 Jun 2024
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
Shuaipeng Li
Penghao Zhao
Hailin Zhang
Xingwu Sun
Hao Wu
...
Zheng Fang
Jinbao Xue
Yangyu Tao
Tengjiao Wang
Di Wang
92
9
0
23 May 2024
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Rodrigo Veiga
Anastasia Remizova
Nicolas Macris
74
1
0
12 Feb 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
53
1
0
02 Feb 2024
Understanding the Generalization Benefits of Late Learning Rate Decay
Yinuo Ren
Chao Ma
Lexing Ying
AI4CE
70
6
0
21 Jan 2024
PCDP-SGD: Improving the Convergence of Differentially Private SGD via Projection in Advance
Haichao Sha
Ruixuan Liu
Yi-xiao Liu
Hong Chen
135
1
0
06 Dec 2023
Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation
Markus Gross
A. Raulf
Christoph Räth
113
0
0
23 Nov 2023
Revisiting Logistic-softmax Likelihood in Bayesian Meta-Learning for Few-Shot Classification
Tianjun Ke
Haoqun Cao
Zenan Ling
Feng Zhou
UQCV
64
8
0
16 Oct 2023
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
F. Chen
D. Kunin
Atsushi Yamamura
Surya Ganguli
127
29
0
07 Jun 2023
Revisiting the Noise Model of Stochastic Gradient Descent
Barak Battash
Ofir Lindenbaum
56
11
0
05 Mar 2023
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability
Mathieu Even
Scott Pesme
Suriya Gunasekar
Nicolas Flammarion
83
18
0
17 Feb 2023
Generalization Bounds with Data-dependent Fractal Dimensions
Benjamin Dupuis
George Deligiannidis
Umut Simsekli
AI4CE
71
12
0
06 Feb 2023
Improving information retention in large scale online continual learning
Z. Cai
V. Koltun
Ozan Sener
CLL
28
1
0
12 Oct 2022
Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions
Courtney Paquette
Elliot Paquette
Ben Adlam
Jeffrey Pennington
63
14
0
15 Jun 2022
Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile
Dong Chen
Lingfei Wu
Siliang Tang
Xiao Yun
Bo Long
Yueting Zhuang
VLM
NoLa
81
9
0
04 Jun 2022
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
Hoileong Lee
Fadhel Ayed
Paul Jung
Juho Lee
Hongseok Yang
François Caron
102
10
0
17 May 2022
On generalization bounds for deep networks based on loss surface implicit regularization
Masaaki Imaizumi
Johannes Schmidt-Hieber
ODL
68
3
0
12 Jan 2022
A Continuous-time Stochastic Gradient Descent Method for Continuous Data
Kexin Jin
J. Latz
Chenguang Liu
Carola-Bibiane Schönlieb
91
9
0
07 Dec 2021
Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi
Masaaki Imaizumi
97
4
0
07 Nov 2021
On the Hyperparameters in Stochastic Gradient Descent with Momentum
Bin Shi
108
14
0
09 Aug 2021
Communication-Efficient Federated Learning via Predictive Coding
Kai Yue
Richeng Jin
Chau-Wai Wong
H. Dai
FedML
71
14
0
02 Aug 2021
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
Liam Hodgkinson
Umut Simsekli
Rajiv Khanna
Michael W. Mahoney
77
23
0
02 Aug 2021
The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
D. Kunin
Javier Sagastuy-Breña
Lauren Gillespie
Eshed Margalit
Hidenori Tanaka
Surya Ganguli
Daniel L. K. Yamins
93
20
0
19 Jul 2021
Structured Stochastic Gradient MCMC
Antonios Alexos
Alex Boyd
Stephan Mandt
BDL
76
13
0
19 Jul 2021
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
Scott Pesme
Loucas Pillaud-Vivien
Nicolas Flammarion
80
108
0
17 Jun 2021
Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models
Courtney Paquette
Elliot Paquette
ODL
102
14
0
07 Jun 2021
Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis
Stephan Wojtowytsch
88
34
0
04 Jun 2021
Combining resampling and reweighting for faithful stochastic optimization
Jing An
Lexing Ying
39
1
0
31 May 2021
FedCor: Correlation-Based Active Client Selection Strategy for Heterogeneous Federated Learning
Minxue Tang
Xuefei Ning
Yitu Wang
Jingwei Sun
Yu Wang
H. Li
Yiran Chen
FedML
89
86
0
24 Mar 2021
Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections
A. Camuto
Xiaoyu Wang
Lingjiong Zhu
Chris Holmes
Mert Gurbuzbalaban
Umut Simsekli
54
16
0
13 Feb 2021
SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality
Courtney Paquette
Kiwon Lee
Fabian Pedregosa
Elliot Paquette
59
35
0
08 Feb 2021
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou
Jiashi Feng
Chao Ma
Caiming Xiong
Guosheng Lin
Weinan E
111
235
0
12 Oct 2020
Unifying supervised learning and VAEs -- coverage, systematics and goodness-of-fit in normalizing-flow based neural network models for astro-particle reconstructions
T. Glüsenkamp
39
1
0
13 Aug 2020
Communication-Efficient Federated Learning via Optimal Client Sampling
Mónica Ribero
H. Vikalo
FedML
87
95
0
30 Jul 2020
Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli
Ozan Sener
George Deligiannidis
Murat A. Erdogdu
86
56
0
16 Jun 2020
The Heavy-Tail Phenomenon in SGD
Mert Gurbuzbalaban
Umut Simsekli
Lingjiong Zhu
59
130
0
08 Jun 2020
Inherent Noise in Gradient Based Methods
Arushi Gupta
19
0
0
26 May 2020
Analysis of Stochastic Gradient Descent in Continuous Time
J. Latz
81
41
0
15 Apr 2020
On Learning Rates and Schrödinger Operators
Bin Shi
Weijie J. Su
Michael I. Jordan
95
61
0
15 Apr 2020
Online Learning in Contextual Bandits using Gated Linear Networks
Eren Sezener
Marcus Hutter
David Budden
Jianan Wang
J. Veness
43
8
0
21 Feb 2020
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
Umut Simsekli
Lingjiong Zhu
Yee Whye Teh
Mert Gurbuzbalaban
90
50
0
13 Feb 2020
Searching for Stage-wise Neural Graphs In the Limit
Xiaoxia Zhou
Dejing Dou
Boyang Albert Li
GNN
38
2
0
30 Dec 2019