arXiv:2402.04084
Provably learning a multi-head attention layer
6 February 2024
Sitan Chen
Yuanzhi Li
MLT
Papers citing "Provably learning a multi-head attention layer"
2 / 2 papers shown
Title
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
46
0
0
02 May 2025
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu
Ruida Zhou
Cong Shen
Jing Yang
23
0
0
17 Oct 2024