Breaking through the learning plateaus of in-context learning in Transformer

12 September 2023

Papers citing "Breaking through the learning plateaus of in-context learning in Transformer"

Title
The Transient Nature of Emergent In-Context Learning in Transformers | Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill | 62 31 0 | 14 Nov 2023
Transformers generalize differently from information stored in context vs in weights | Stephanie C. Y. Chan, Ishita Dasgupta, Junkyung Kim, D. Kumaran, Andrew Kyle Lampinen, Felix Hill | 98 45 0 | 11 Oct 2022
In-context Learning and Induction Heads | Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah | 240 453 0 | 24 Sep 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling | Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy | AIMat | 245 1,977 0 | 31 Dec 2020
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives | Elena Voita, Rico Sennrich, Ivan Titov | 188 181 0 | 03 Sep 2019