Convergence of Gradient Descent on Separable Data (arXiv:1803.01905)

5 March 2018
Mor Shpigel Nacson, Jason D. Lee, Suriya Gunasekar, Pedro H. P. Savarese, Nathan Srebro, Daniel Soudry
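
For context on the result named in the title (building on Soudry et al., "The Implicit Bias of Gradient Descent on Separable Data", listed below): on linearly separable data, gradient descent on the logistic loss drives the weight norm to infinity, while the weight direction converges to the L2 max-margin (hard-margin SVM) separator. The NumPy sketch below is illustrative only and is not taken from the paper or this page; the data, step size, and iteration count are arbitrary choices for the demo.

import numpy as np

# Illustrative sketch (not from the paper): gradient descent on the mean
# logistic loss over linearly separable data. The loss has no finite
# minimizer, so ||w|| grows without bound, but w/||w|| converges in
# direction to the L2 max-margin separator of the sample.
rng = np.random.default_rng(0)
n, d = 40, 2
X = rng.normal(size=(n, d))
w_star = np.array([1.0, 2.0])
y = np.sign(X @ w_star)                     # labels separable by construction

w = np.zeros(d)
lr = 0.1
for _ in range(100_000):
    margins = y * (X @ w)
    # gradient of (1/n) * sum log(1 + exp(-margin)); clip avoids overflow in exp
    s = 1.0 / (1.0 + np.exp(np.clip(margins, -30.0, 30.0)))
    grad = -(X.T @ (y * s)) / n
    w -= lr * grad

print("||w||     =", np.linalg.norm(w))     # keeps growing (at a log-t rate)
print("direction =", w / np.linalg.norm(w)) # approaches the max-margin direction

Running longer moves the printed direction closer to the hard-margin SVM solution for this sample while the norm keeps growing; this is the implicit-bias phenomenon that many of the citing papers below extend to other losses, step-size regimes, and architectures.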

Papers citing "Convergence of Gradient Descent on Separable Data" (22 of 22 papers shown)

Embedding principle of homogeneous neural network for classification problem
Jiahan Zhang, Yaoyu Zhang · 18 May 2025 · 0 citations

How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang, Yingbin Liang, Jing Yang · 02 May 2025 · 0 citations

Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett · 05 Apr 2025 · 0 citations

Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao, Tina Behnia, V. Vakilian, Christos Thrampoulidis · 20 Feb 2025 · 10 citations

Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu, Jiazheng Li, J.N. Zhang · 18 Feb 2025 · 2 citations · Communities: MoMe, FedML

The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman, Nicolas Schreuder · 08 Feb 2025 · 0 citations

Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis · 08 Feb 2024 · 18 citations

Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study
Assaf Dauber, M. Feder, Tomer Koren, Roi Livni · 13 Mar 2020 · 24 citations

Gradient descent aligns the layers of deep linear networks
Ziwei Ji, Matus Telgarsky · 04 Oct 2018 · 250 citations

When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?
Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang · 12 Jun 2018 · 19 citations

Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry · 05 Jun 2018 · 100 citations · Communities: FedML, MLT

Implicit Bias of Gradient Descent on Linear Convolutional Networks
Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nathan Srebro · 01 Jun 2018 · 408 citations · Communities: MDE

Risk and parameter convergence of logistic regression
Ziwei Ji, Matus Telgarsky · 20 Mar 2018 · 129 citations

Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nathan Srebro · 22 Feb 2018 · 404 citations · Communities: AI4CE

The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro · 27 Oct 2017 · 908 citations

Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer, Itay Hubara, Daniel Soudry · 24 May 2017 · 798 citations · Communities: ODL

The Power of Normalization: Faster Evasion of Saddle Points
Kfir Y. Levy · 15 Nov 2016 · 108 citations

Understanding deep learning requires rethinking generalization
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals · 10 Nov 2016 · 4,612 citations · Communities: HAI

Wide Residual Networks
Sergey Zagoruyko, N. Komodakis · 23 May 2016 · 7,951 citations

In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning
Behnam Neyshabur, Ryota Tomioka, Nathan Srebro · 20 Dec 2014 · 655 citations · Communities: AI4CE

Margins, Shrinkage, and Boosting
Matus Telgarsky · 18 Mar 2013 · 73 citations

Sublinear Optimization for Machine Learning
K. Clarkson, Elad Hazan, David P. Woodruff · 21 Oct 2010 · 138 citations