2107.00595
Fast Margin Maximization via Dual Acceleration
1 July 2021
Ziwei Ji
Nathan Srebro
Matus Telgarsky
Papers citing "Fast Margin Maximization via Dual Acceleration"
29 / 29 papers shown
Title
Constant Stepsize Local GD for Logistic Regression: Acceleration by Instability
M. Crawshaw
Blake Woodworth
Mingrui Liu
17
0
0
16 Jun 2025
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Vatsal Sharan
Mahdi Soltanolkotabi
15
0
0
29 May 2025
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
120
0
0
02 May 2025
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang
Jingfeng Wu
Licong Lin
Peter L. Bartlett
83
2
0
05 Apr 2025
The Implicit Bias of Gradient Descent on Separable Multiclass Data
Hrithik Ravi
Clayton Scott
Daniel Soudry
Yutong Wang
110
4
0
02 Nov 2024
Non-asymptotic Convergence of Training Transformers for Next-token Prediction
Ruiquan Huang
Yingbin Liang
Jing Yang
88
7
0
25 Sep 2024
Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization
Yuhang Cai
Jingfeng Wu
Song Mei
Michael Lindsey
Peter L. Bartlett
91
4
0
12 Jun 2024
Improving Generalization and Convergence by Enhancing Implicit Regularization
Mingze Wang
Haotian He
Jinbo Wang
Zilin Wang
Guanhua Huang
Feiyu Xiong
Zhiyu Li
Weinan E
Lei Wu
96
8
0
31 May 2024
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
Heejune Sheen
Siyu Chen
Tianhao Wang
Harrison H. Zhou
MLT
87
13
0
13 Mar 2024
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Bhavya Vasudeva
Deqing Fu
Tianyi Zhou
Elliott Kau
Youqi Huang
Vatsal Sharan
101
4
0
11 Mar 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
104
21
0
08 Feb 2024
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang
Zeping Min
Lei Wu
84
3
0
24 Nov 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao Song
Weixin Wang
Junze Yin
111
29
0
14 Sep 2023
Transformers as Support Vector Machines
Davoud Ataee Tarzanagh
Yingcong Li
Christos Thrampoulidis
Samet Oymak
133
49
0
31 Aug 2023
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
Haoyuan Sun
Khashayar Gatmiry
Kwangjun Ahn
Navid Azizan
AI4CE
74
13
0
24 Jun 2023
Max-Margin Token Selection in Attention Mechanism
Davoud Ataee Tarzanagh
Yingcong Li
Xuechen Zhang
Samet Oymak
107
45
0
23 Jun 2023
Faster Margin Maximization Rates for Generic and Adversarially Robust Optimization Methods
Guanghui Wang
Zihao Hu
Claudio Gentile
Vidya Muthukumar
Jacob D. Abernethy
99
0
0
27 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
65
21
0
19 May 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
84
23
0
02 Mar 2023
Iterative regularization in classification via hinge loss diagonal descent
Vassilis Apidopoulos
T. Poggio
Lorenzo Rosasco
S. Villa
61
2
0
24 Dec 2022
On Accelerated Perceptrons and Beyond
Guanghui Wang
Rafael Hanashiro
E. Guha
Jacob D. Abernethy
73
7
0
17 Oct 2022
On Generalization of Decentralized Learning with Separable Data
Hossein Taheri
Christos Thrampoulidis
FedML
84
11
0
15 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
91
81
0
26 Aug 2022
Kernel Memory Networks: A Unifying Framework for Memory Modeling
Georgios Iatropoulos
Johanni Brea
W. Gerstner
50
10
0
19 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
100
29
0
08 Jul 2022
Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
Haoyuan Sun
Kwangjun Ahn
Christos Thrampoulidis
Navid Azizan
OOD
56
22
0
25 May 2022
On the Optimization of Margin Distribution
Meng-Zhang Qian
Zheng Ai
Teng Zhang
Wei Gao
18
1
0
29 Apr 2022
Does Momentum Change the Implicit Regularization on Separable Data?
Bohan Wang
Qi Meng
Huishuai Zhang
Ruoyu Sun
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu
99
18
0
08 Oct 2021
Properties of the After Kernel
Philip M. Long
66
29
0
21 May 2021