arXiv: 2303.04245
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Yuchen Li, Yuanzhi Li, Andrej Risteski
7 March 2023
Papers citing "How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding" (10 papers)
1. When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers. Hongkang Li, Yihua Zhang, Shuai Zhang, M. Wang, Sijia Liu, Pin-Yu Chen. [MoMe] 15 Apr 2025.
2. Tracking the Feature Dynamics in LLM Training: A Mechanistic Study. Yang Xu, Y. Wang, Hao Wang. 23 Dec 2024.
3. From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency. Kaiyue Wen, Huaqing Zhang, Hongzhou Lin, Jingzhao Zhang. [MoE, LRM] 07 Oct 2024.
4. Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization. Xinhao Yao, Hongjin Qian, Xiaolin Hu, Gengze Xu, Yong Liu, Wei Liu, Jian Luan, Bin Wang. 03 Oct 2024.
5. Attention layers provably solve single-location regression. P. Marion, Raphael Berthier, Gérard Biau, Claire Boyer. 02 Oct 2024.
6. An Information-Theoretic Analysis of In-Context Learning. Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy. 28 Jan 2024.
7. Towards Best Practices of Activation Patching in Language Models: Metrics and Methods. Fred Zhang, Neel Nanda. [LLMSV] 27 Sep 2023.
8. Learning threshold neurons via the "edge of stability". Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang. [MLT] 14 Dec 2022.
9. Probing Classifiers: Promises, Shortcomings, and Advances. Yonatan Belinkov. 24 Feb 2021.
10. Topic Modeling with Contextualized Word Representation Clusters. Laure Thompson, David M. Mimno. 23 Oct 2020.