How do Transformers perform In-Context Autoregressive Learning?

8 February 2024

Papers citing "How do Transformers perform In-Context Autoregressive Learning?"

Title
Towards Understanding the Universality of Transformers for Next-Token Prediction; Michael E. Sander, Gabriel Peyré; CML; 31 0 0; 03 Oct 2024
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation; Ofir Press, Noah A. Smith, M. Lewis; 242 695 0; 27 Aug 2021