arXiv:2402.01258 — Cited By
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim, Taiji Suzuki
2 February 2024
Papers citing "Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape" (9 of 9 papers shown):
1. How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
   Ruiquan Huang, Yingbin Liang, Jing Yang — 02 May 2025

2. ICL CIPHERS: Quantifying "Learning" in In-Context Learning via Substitution Ciphers
   Zhouxiang Fang, Aayush Mishra, Muhan Gao, Anqi Liu, Daniel Khashabi — 28 Apr 2025

3. In-Context Learning with Hypothesis-Class Guidance
   Ziqian Lin, Shubham Kumar Bharti, Kangwook Lee — 27 Feb 2025

4. On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
   Renpu Liu, Ruida Zhou, Cong Shen, Jing Yang — 17 Oct 2024

5. From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
   Kaiyue Wen, Huaqing Zhang, Hongzhou Lin, Jingzhao Zhang — 07 Oct 2024

6. Towards Understanding the Universality of Transformers for Next-Token Prediction
   Michael E. Sander, Gabriel Peyré — 03 Oct 2024

7. How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
   Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Peter L. Bartlett — 12 Oct 2023

8. Convex Analysis of the Mean Field Langevin Dynamics
   Atsushi Nitanda, Denny Wu, Taiji Suzuki — 25 Jan 2022

9. First-order Methods Almost Always Avoid Saddle Points
   J. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael I. Jordan, Benjamin Recht — 20 Oct 2017