Rethinking the Role of Scale for In-Context Learning: An
Interpretability-based Case Study at 66 Billion Scale

Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale

18 December 2022

Karthik Gopalakrishnan

Saket Dingliwal

Katrin Kirchhoff

Papers citing "Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale"

12 / 12 papers shown

Title
Understanding In-context Learning of Addition via Activation Subspaces Xinyan Hu Kayo Yin Michael I. Jordan Jacob Steinhardt Lijie Chen 49 0 0 08 May 2025
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models Daking Rai Yilun Zhou Shi Feng Abulhair Saparov Ziyu Yao 75 18 0 02 Jul 2024
ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models Yash Akhauri Ahmed F. AbouElhamayed Jordan Dotzel Zhiru Zhang Alexander M Rush Safeen Huda Mohamed S. Abdelfattah 18 2 0 24 Jun 2024
Where does In-context Translation Happen in Large Language Models Suzanna Sia David Mueller Kevin Duh LRM 33 0 0 07 Mar 2024
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods Fred Zhang Neel Nanda LLMSV 26 96 0 27 Sep 2023
On the (In)Effectiveness of Large Language Models for Chinese Text Correction Yinghui Li Haojing Huang Shirong Ma Yong-jia Jiang Y. Li F. Zhou Haitao Zheng Qingyu Zhou 29 43 0 18 Jul 2023
A Simple and Effective Pruning Approach for Large Language Models Mingjie Sun Zhuang Liu Anna Bair J. Zico Kolter 50 353 0 20 Jun 2023
Transformers generalize differently from information stored in context vs in weights Stephanie C. Y. Chan Ishita Dasgupta Junkyung Kim D. Kumaran Andrew Kyle Lampinen Felix Hill 100 45 0 11 Oct 2022
In-context Learning and Induction Heads Catherine Olsson Nelson Elhage Neel Nanda Nicholas Joseph Nova Dassarma ... Tom B. Brown Jack Clark Jared Kaplan Sam McCandlish C. Olah 240 456 0 24 Sep 2022
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity Yao Lu Max Bartolo Alastair Moore Sebastian Riedel Pontus Stenetorp AILaw LRM 277 1,114 0 18 Apr 2021
What Makes Good In-Context Examples for GPT- $3$ ? Jiachang Liu Dinghan Shen Yizhe Zhang Bill Dolan Lawrence Carin Weizhu Chen AAML RALM 275 1,311 0 17 Jan 2021
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 226 4,424 0 23 Jan 2020