The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
arXiv:2505.15807 · 21 May 2025
Patrick Kahardipraja, Reduan Achtibat, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin
Papers citing "The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation" (31 papers)
Which Attention Heads Matter for In-Context Learning?
Kayo Yin, Jacob Steinhardt · 19 Feb 2025
Controllable Context Sensitivity and the Knob Behind It
Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell · KELM · 11 Nov 2024
Knowledge Conflicts for LLMs: A Survey
Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, Wei Xu · 13 Mar 2024
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek · 08 Feb 2024
Characterizing Mechanisms for Factual Recall in Language Models
Qinan Yu, Jack Merullo, Ellie Pavlick · KELM · 24 Oct 2023
Function Vectors in Large Language Models
Eric Todd, Millicent Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau · 23 Oct 2023
Enabling Large Language Models to Generate Text with Citations
Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen · LM&MA, HILM · 24 May 2023
Long Time No See! Open-Domain Conversation with Long-Term Persona Memory
Xinchao Xu, Zhibin Gou, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Haifeng Wang, Shihang Wang · RALM · 11 Mar 2022
Measuring the Mixing of Contextual Information in the Transformer
Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà · 08 Mar 2022
Improving language models by retrieving from trillions of tokens
Sebastian Borgeaud, A. Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, ..., Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre · KELM, RALM · 08 Dec 2021
Entity-Based Knowledge Conflicts in Question Answering
Shayne Longpre, Kartik Perisetla, Anthony Chen, Nikhil Ramesh, Chris DuBois, Sameer Singh · HILM · 10 Sep 2021
Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei · KELM, MU · 18 Apr 2021
Retrieval Augmentation Reduces Hallucination in Conversation
Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston · HILM · 15 Apr 2021
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen · AAML, RALM · 17 Jan 2021
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings, Katja Filippova · XAI, LRM · 12 Oct 2020
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei · BDL · 28 May 2020
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang · RALM · 10 Feb 2020
Understanding and Improving Layer Normalization
Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin · FAtt · 16 Nov 2019
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, W. Dolan · VLM · 01 Nov 2019
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, M. Lewis · RALM · 01 Nov 2019
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, Danqi Chen · 22 Oct 2019
Language Models as Knowledge Bases?
Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel · KELM, AI4MH · 03 Sep 2019
Attention is not not Explanation
Sarah Wiegreffe, Yuval Pinter · XAI, AAML, FAtt · 13 Aug 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita, David Talbot, F. Moiseev, Rico Sennrich, Ivan Titov · 23 May 2019
Attention is not Explanation
Sarthak Jain, Byron C. Wallace · FAtt · 26 Feb 2019
The Natural Language Decathlon: Multitask Learning as Question Answering
Bryan McCann, N. Keskar, Caiming Xiong, R. Socher · AIMat, MLLM, BDL · 20 Jun 2018
Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin · 12 Jun 2017
A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg, Su-In Lee · FAtt · 22 May 2017
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer · RALM · 09 May 2017
Axiomatic Attribution for Deep Networks
Mukund Sundararajan, Ankur Taly, Qiqi Yan · OOD, FAtt · 04 Mar 2017
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le · AIMat · 10 Sep 2014