arXiv: 1807.05031
On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length
13 July 2018
Stanislaw Jastrzebski, Zachary Kenton, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey
Tags: ODL
Papers citing "On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length" (17 of 17 papers shown)
Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun. 25 May 2024.

Fairness Without Demographics in Human-Centered Federated Learning
Shaily Roy, Harshit Sharma, Asif Salekin. 30 Apr 2024.

Accelerating Distributed ML Training via Selective Synchronization
S. Tyagi, Martin Swany. Tags: FedML. 16 Jul 2023.

On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin. 03 Feb 2023.

Communication-Efficient Federated Learning for Heterogeneous Edge Devices Based on Adaptive Gradient Quantization
Heting Liu, Fang He, Guohong Cao. Tags: FedML, MQ. 16 Dec 2022.

Learning threshold neurons via the "edge of stability"
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang. Tags: MLT. 14 Dec 2022.

Understanding the unstable convergence of gradient descent
Kwangjun Ahn, J. Zhang, S. Sra. 03 Apr 2022.

The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins. 19 Jul 2021.

Consensus Control for Decentralized Deep Learning
Lingjing Kong, Tao R. Lin, Anastasia Koloskova, Martin Jaggi, Sebastian U. Stich. 09 Feb 2021.

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian. 07 Dec 2020.

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol, S. Zohren, Stephen J. Roberts. Tags: ODL. 16 Jun 2020.

The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras. 21 Feb 2020.

Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei, Tengyu Ma. Tags: AAML, OOD. 09 Oct 2019.

GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks
Avraam Chatzimichailidis, Franz-Josef Pfreundt, N. Gauger, J. Keuper. 26 Sep 2019.

Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan. 28 Jan 2019.

Laplacian Smoothing Gradient Descent
Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, A. Lin. Tags: ODL. 17 Jun 2018.

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. Tags: ODL. 15 Sep 2016.