Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis
Stephan Wojtowytsch
arXiv:2106.02588, 4 June 2021
Papers citing "Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis" (13 papers):
1. Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training. Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan. 14 Oct 2024.
2. Stochastic Modified Flows for Riemannian Stochastic Gradient Descent. Benjamin Gess, Sebastian Kassing, Nimit Rana. 02 Feb 2024.
3. Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances. Marcel Kühn, B. Rosenow. 08 Jun 2023.
4. mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization. Kayhan Behdin, Qingquan Song, Aman Gupta, S. Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, D. Durfee, Rahul Mazumder. 19 Feb 2023.
5. Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent. Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi. 14 Feb 2023.
6. Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis. Taiki Miyagawa. 28 Oct 2022.
7. From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent. Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan. 13 Oct 2022.
8. Identical Image Retrieval using Deep Learning. Sayan Nath, Nikhil Nayak. 10 May 2022.
9. A Continuous-time Stochastic Gradient Descent Method for Continuous Data. Kexin Jin, J. Latz, Chenguang Liu, Carola-Bibiane Schönlieb. 07 Dec 2021.
10. What Happens after SGD Reaches Zero Loss? A Mathematical Framework. Zhiyuan Li, Tianhao Wang, Sanjeev Arora. 13 Oct 2021.
11. Stochastic gradient descent with noise of machine learning type. Part I: Discrete time analysis. Stephan Wojtowytsch. 04 May 2021.
12. Convergence of stochastic gradient descent schemes for Łojasiewicz-landscapes. Steffen Dereich, Sebastian Kassing. 16 Feb 2021.
13. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. 15 Sep 2016.