Which Attention Heads Matter for In-Context Learning?

19 February 2025

Papers citing "Which Attention Heads Matter for In-Context Learning?"

The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Patrick Kahardipraja, Reduan Achtibat, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin
86 0 0 · 21 May 2025

Mechanistic evaluation of Transformers and state space models
Aryaman Arora, Neil Rathi, Nikil Roashan Selvam, Róbert Csordás, Dan Jurafsky, Christopher Potts
63 0 0 · 21 May 2025

Do different prompting methods yield a common task representation in language models?
Guy Davidson, Todd M. Gureckis, Brenden M. Lake, Adina Williams
50 0 0 · 17 May 2025

Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
Jeffrey Willette, Heejun Lee, Sung Ju Hwang
52 0 0 · 16 May 2025

Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning
Jingcheng Niu, Subhabrata Dutta, Ahmed Elshabrawy, Harish Tayyar Madabushi, Iryna Gurevych
86 1 0 · 16 May 2025

Understanding In-context Learning of Addition via Activation Subspaces
Xinyan Hu, Kayo Yin, Michael I. Jordan, Jacob Steinhardt, Lijie Chen
105 0 0 · 08 May 2025

Page Classification for Print Imaging Pipeline
Shaoyuan Xu, Cheng Lu, Mark Shaw, Peter Bauer, J. Allebach
VLM · 61 0 0 · 03 Apr 2025

Repetitions are not all alike: distinct mechanisms sustain repetition in language models
Matéo Mahaut, Francesca Franzon
68 0 0 · 01 Apr 2025

Focus Directions Make Your Language Models Pay More Attention to Relevant Contexts
Youxiang Zhu, Ruochen Li, Danqing Wang, Daniel Haehn, Xiaohui Liang
LRM · 89 2 0 · 30 Mar 2025

Strategy Coopetition Explains the Emergence and Transience of In-Context Learning
Aaditya K. Singh, Ted Moskovitz, Sara Dragutinovic, Felix Hill, Stephanie C. Y. Chan, Andrew Saxe
356 2 0 · 07 Mar 2025