Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability

12 October 2023

Papers citing "Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability"

3 / 3 papers shown

Title
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation Ofir Press Noah A. Smith M. Lewis 242 690 0 27 Aug 2021
What Makes Good In-Context Examples for GPT- $3$ ? Jiachang Liu Dinghan Shen Yizhe Zhang Bill Dolan Lawrence Carin Weizhu Chen AAML RALM 275 1,296 0 17 Jan 2021
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 226 4,424 0 23 Jan 2020