On Learning Rates and Schrödinger Operators
Bin Shi, Weijie J. Su, Michael I. Jordan
arXiv:2004.06977 · 15 April 2020
Papers citing "On Learning Rates and Schrödinger Operators" (16 papers)
FOCUS: First Order Concentrated Updating Scheme — Yizhou Liu, Ziming Liu, Jeff Gore (21 Jan 2025)
A General Continuous-Time Formulation of Stochastic ADMM and Its Variants — Chris Junchi Li (22 Apr 2024)
Quantum Langevin Dynamics for Optimization — Zherui Chen, Yuchen Lu, Hao Wang, Yizhou Liu, Tongyang Li (27 Nov 2023)
On Underdamped Nesterov's Acceleration — Shu Chen, Bin Shi, Ya-xiang Yuan (28 Apr 2023)
Learning Rate Schedules in the Presence of Distribution Shift — Matthew Fahrbach, Adel Javanmard, Vahab Mirrokni, Pratik Worah (27 Mar 2023)
Global Convergence of SGD On Two Layer Neural Nets — Pulkit Gopalani, Anirbit Mukherjee (20 Oct 2022)
On Quantum Speedups for Nonconvex Optimization via Quantum Tunneling Walks — Yizhou Liu, Weijie J. Su, Tongyang Li (29 Sep 2022)
Gradient Norm Minimization of Nesterov Acceleration: $o(1/k^3)$ — Shu Chen, Bin Shi, Ya-xiang Yuan (19 Sep 2022)
On Uniform Boundedness Properties of SGD and its Momentum Variants — Xiaoyu Wang, M. Johansson (25 Jan 2022)
Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations — Jiayao Zhang, Hua Wang, Weijie J. Su (11 Oct 2021)
On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs) — Zhiyuan Li, Sadhika Malladi, Sanjeev Arora (24 Feb 2021)
Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training — Cong Fang, Hangfeng He, Qi Long, Weijie J. Su (29 Jan 2021)
Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge — Chaoyang He, M. Annavaram, A. Avestimehr (28 Jul 2020)
On stochastic mirror descent with interacting particles: convergence properties and variance reduction — Anastasia Borovykh, N. Kantas, P. Parpas, G. Pavliotis (15 Jul 2020)
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima — N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights — Weijie Su, Stephen P. Boyd, Emmanuel J. Candes (04 Mar 2015)