2403.08699
Cited By
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
13 March 2024
Heejune Sheen
Siyu Chen
Tianhao Wang
Harrison H. Zhou
MLT
Papers citing "Implicit Regularization of Gradient Flow on One-Layer Softmax Attention"
12 / 12 papers shown
Revisiting Transformers through the Lens of Low Entropy and Dynamic Sparsity
Ruifeng Ren
Yong Liu
39
0
0
26 Apr 2025
Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?
Tom Jacobs
Chao Zhou
R. Burkholz
OffRL
AI4CE
23
0
0
17 Apr 2025
Gating is Weighting: Understanding Gated Linear Attention through In-context Learning
Yingcong Li
Davoud Ataee Tarzanagh
A. S. Rawat
Maryam Fazel
Samet Oymak
23
0
0
06 Apr 2025
Training Dynamics of In-Context Learning in Linear Attention
Yedi Zhang
Aaditya K. Singh
Peter E. Latham
Andrew Saxe
MLT
59
1
0
28 Jan 2025
Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Bingcong Li
Liang Zhang
Niao He
36
3
0
18 Oct 2024
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Kaiyue Wen
Huaqing Zhang
Hongzhou Lin
Jingzhao Zhang
MoE
LRM
58
2
0
07 Oct 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
37
3
0
19 Aug 2024
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang
Goutham Rajendran
Pradeep Ravikumar
Bryon Aragam
CLL
KELM
29
6
0
26 Jun 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
24
13
0
08 Feb 2024
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Yuchen Li
Yuan-Fang Li
Andrej Risteski
107
61
0
07 Mar 2023
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li
Tianhao Wang
Sanjeev Arora
MLT
83
98
0
13 Oct 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
225
2,404
0
04 Jan 2021