From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
arXiv: 2410.05459
7 October 2024
Kaiyue Wen
Huaqing Zhang
Hongzhou Lin
Jingzhao Zhang
Papers citing
"From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency"
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
02 May 2025