2102.12430
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
24 February 2021
Tianyi Liu
Yan Li
S. Wei
Enlu Zhou
T. Zhao
Papers citing
"Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization"
12 / 12 papers shown
Stochastic Gradient Descent Jittering for Inverse Problems: Alleviating the Accuracy-Robustness Tradeoff
Peimeng Guan
Mark A. Davenport
28
0
0
18 Oct 2024
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li
Xinwei Zhang
Peilin Zhong
Yuan Deng
Meisam Razaviyayn
Vahab Mirrokni
20
2
0
09 Oct 2024
On subdifferential chain rule of matrix factorization and beyond
Jiewen Guan
Anthony Man-Cho So
AI4CE
23
1
0
07 Oct 2024
Demystifying SGD with Doubly Stochastic Gradients
Kyurae Kim
Joohwan Ko
Yian Ma
Jacob R. Gardner
48
0
0
03 Jun 2024
Uniform-in-time propagation of chaos for mean field Langevin dynamics
Fan Chen
Zhenjie Ren
Song-bo Wang
35
30
0
06 Dec 2022
How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization
Jonas Geiping
Micah Goldblum
Gowthami Somepalli
Ravid Shwartz-Ziv
Tom Goldstein
A. Wilson
21
35
0
12 Oct 2022
Explicit Regularization in Overparametrized Models via Noise Injection
Antonio Orvieto
Anant Raj
Hans Kersting
Francis R. Bach
10
26
0
09 Jun 2022
Flat minima generalize for low-rank matrix recovery
Lijun Ding
D. Drusvyatskiy
Maryam Fazel
Zaid Harchaoui
26
16
0
07 Mar 2022
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang
Minshuo Chen
T. Zhao
Molei Tao
AI4CE
55
40
0
07 Oct 2021
Statistical limits of dictionary learning: random matrix theory and the spectral replica method
Jean Barbier
N. Macris
33
24
0
14 Sep 2021
Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Tian-Chun Ye
S. Du
19
46
0
27 Jun 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,886
0
15 Sep 2016