1803.01905
Convergence of Gradient Descent on Separable Data
5 March 2018
Mor Shpigel Nacson
Jason D. Lee
Suriya Gunasekar
Pedro H. P. Savarese
Nathan Srebro
Daniel Soudry
Papers citing "Convergence of Gradient Descent on Separable Data"
22 / 22 papers shown
Title
Embedding principle of homogeneous neural network for classification problem
Jiahan Zhang
Yaoyu Zhang
27
0
0
18 May 2025
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
78
0
0
02 May 2025
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang
Jingfeng Wu
Licong Lin
Peter L. Bartlett
43
0
0
05 Apr 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
110
10
0
20 Feb 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe
FedML
186
2
0
18 Feb 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman
Nicolas Schreuder
362
0
0
08 Feb 2025
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
44
18
0
08 Feb 2024
Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study
Assaf Dauber
M. Feder
Tomer Koren
Roi Livni
29
24
0
13 Mar 2020
Gradient descent aligns the layers of deep linear networks
Ziwei Ji
Matus Telgarsky
91
250
0
04 Oct 2018
When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?
Tengyu Xu
Yi Zhou
Kaiyi Ji
Yingbin Liang
44
19
0
12 Jun 2018
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Mor Shpigel Nacson
Nathan Srebro
Daniel Soudry
FedML
MLT
51
100
0
05 Jun 2018
Implicit Bias of Gradient Descent on Linear Convolutional Networks
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
MDE
45
408
0
01 Jun 2018
Risk and parameter convergence of logistic regression
Ziwei Ji
Matus Telgarsky
28
129
0
20 Mar 2018
Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
AI4CE
57
404
0
22 Feb 2018
The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
60
908
0
27 Oct 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer
Itay Hubara
Daniel Soudry
ODL
134
798
0
24 May 2017
The Power of Normalization: Faster Evasion of Saddle Points
Kfir Y. Levy
49
108
0
15 Nov 2016
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
245
4,612
0
10 Nov 2016
Wide Residual Networks
Sergey Zagoruyko
N. Komodakis
231
7,951
0
23 May 2016
In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
AI4CE
55
655
0
20 Dec 2014
Margins, Shrinkage, and Boosting
Matus Telgarsky
44
73
0
18 Mar 2013
Sublinear Optimization for Machine Learning
K. Clarkson
Elad Hazan
David P. Woodruff
50
138
0
21 Oct 2010