2402.09221
Cited By
Spectral Filters, Dark Signals, and Attention Sinks
14 February 2024
Nicola Cancedda
Papers citing "Spectral Filters, Dark Signals, and Attention Sinks"
5 / 5 papers shown
Outlier dimensions favor frequent tokens in language models
Iuri Macocco, Nora Graichen, Gemma Boleda, Marco Baroni
42 / 0 / 0
27 Mar 2025

More Expressive Attention with Negative Weights
Ang Lv, Ruobing Xie, Shuaipeng Li, Jiayi Liao, X. Sun, Zhanhui Kang, Di Wang, Rui Yan
30 / 0 / 0
11 Nov 2024

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai, Yilun Zhou, Shi Feng, Abulhair Saparov, Ziyu Yao
49 / 18 / 0
02 Jul 2024

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah
234 / 453 / 0
24 Sep 2022

Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, T. Henighan, ..., Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, C. Olah
Tags: AAML, MILM
117 / 314 / 0
21 Sep 2022