2212.07677
Cited By
Transformers learn in-context by gradient descent
15 December 2022
J. Oswald
Eyvind Niklasson
E. Randazzo
João Sacramento
A. Mordvintsev
A. Zhmoginov
Max Vladymyrov
MLT
Papers citing "Transformers learn in-context by gradient descent"
50 / 71 papers shown
Title
Understanding In-context Learning of Addition via Activation Subspaces
Xinyan Hu
Kayo Yin
Michael I. Jordan
Jacob Steinhardt
Lijie Chen
44
0
0
08 May 2025
Rethinking Invariance in In-context Learning
Lizhe Fang
Yifei Wang
Khashayar Gatmiry
Lei Fang
Y. Wang
44
1
0
08 May 2025
On the generalization of language models from in-context learning and finetuning: a controlled study
Andrew Kyle Lampinen
Arslan Chaudhry
Stephanie Chan
Cody Wild
Diane Wan
Alex Ku
Jorg Bornschein
Razvan Pascanu
Murray Shanahan
James L. McClelland
46
0
0
01 May 2025
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Vishnu Sarukkai
Zhiqiang Xie
Kayvon Fatahalian
LLMAG
68
0
0
01 May 2025
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li
Yihua Zhang
Shuai Zhang
M. Wang
Sijia Liu
Pin-Yu Chen
MoMe
57
2
0
15 Apr 2025
An extension of linear self-attention for in-context learning
Katsuyuki Hagiwara
34
0
0
31 Mar 2025
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
Simeng Sun
Cheng-Ping Hsieh
Faisal Ladhak
Erik Arakelyan
Santiago Akle Serano
Boris Ginsburg
ReLM
ELM
LRM
59
0
0
28 Mar 2025
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
Anh Tong
Thanh Nguyen-Tang
Dongeun Lee
Duc Nguyen
Toan M. Tran
David Hall
Cheongwoong Kang
Jaesik Choi
33
0
0
03 Mar 2025
In-Context Learning with Hypothesis-Class Guidance
Ziqian Lin
Shubham Kumar Bharti
Kangwook Lee
66
0
0
27 Feb 2025
Vector-ICL: In-context Learning with Continuous Vector Representations
Yufan Zhuang
Chandan Singh
Liyuan Liu
Jingbo Shang
Jianfeng Gao
52
3
0
21 Feb 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
89
18
0
21 Feb 2025
CoT-ICL Lab: A Petri Dish for Studying Chain-of-Thought Learning from In-Context Demonstrations
Vignesh Kothapalli
Hamed Firooz
Maziar Sanjabi
57
0
0
21 Feb 2025
Zero-shot Model-based Reinforcement Learning using Large Language Models
Abdelhakim Benechehab
Youssef Attia El Hili
Ambroise Odonnat
Oussama Zekri
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
I. Redko
Balázs Kégl
OffRL
62
1
0
17 Feb 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
Yutong Yin
Zhaoran Wang
LRM
ReLM
56
0
0
27 Jan 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
78
4
0
31 Dec 2024
ICLR: In-Context Learning of Representations
Core Francisco Park
Andrew Lee
Ekdeep Singh Lubana
Yongyi Yang
Maya Okawa
Kento Nishi
Martin Wattenberg
Hidenori Tanaka
AIFin
111
3
0
29 Dec 2024
Revisiting In-Context Learning with Long Context Language Models
Jinheon Baek
Sun Jae Lee
Prakhar Gupta
Geunseob Oh
Siddharth Dalmia
101
0
0
22 Dec 2024
Toward Understanding In-context vs. In-weight Learning
Bryan Chan
Xinyi Chen
András Gyorgy
Dale Schuurmans
65
3
0
30 Oct 2024
In-context learning and Occam's razor
Eric Elmoznino
Tom Marty
Tejas Kasetty
Léo Gagnon
Sarthak Mittal
Mahan Fathi
Dhanya Sridhar
Guillaume Lajoie
32
1
0
17 Oct 2024
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu
Ruida Zhou
Cong Shen
Jing Yang
23
0
0
17 Oct 2024
Context-Scaling versus Task-Scaling in In-Context Learning
Amirhesam Abedsoltan
Adityanarayanan Radhakrishnan
Jingfeng Wu
M. Belkin
ReLM
LRM
32
3
0
16 Oct 2024
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
35
3
0
15 Oct 2024
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
80
19
0
15 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability
Futing Wang
Jianhao Yan
Yue Zhang
Tao Lin
35
0
0
12 Oct 2024
Zero-Shot Learning of Causal Models
Divyat Mahajan
Jannes Gladrow
Agrin Hilmkil
Cheng Zhang
M. Scetbon
28
1
0
08 Oct 2024
Density estimation with LLMs: a geometric investigation of in-context learning trajectories
Toni J. B. Liu
Nicolas Boullé
Raphael Sarfati
Christopher Earls
20
0
0
07 Oct 2024
In-context Learning in Presence of Spurious Correlations
Hrayr Harutyunyan
R. Darbinyan
Samvel Karapetyan
Hrant Khachatrian
LRM
30
1
0
04 Oct 2024
Towards Understanding the Universality of Transformers for Next-Token Prediction
Michael E. Sander
Gabriel Peyré
CML
29
0
0
03 Oct 2024
Mitigating Copy Bias in In-Context Learning through Neuron Pruning
Ameen Ali
Lior Wolf
Ivan Titov
27
2
0
02 Oct 2024
Racing Thoughts: Explaining Contextualization Errors in Large Language Models
Michael A. Lepori
Michael Mozer
Asma Ghandeharioun
LRM
80
1
0
02 Oct 2024
Transformers Handle Endogeneity in In-Context Linear Regression
Haodong Liang
Krishnakumar Balasubramanian
Lifeng Lai
32
1
0
02 Oct 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu
Issei Sato
37
3
0
02 Oct 2024
Attention layers provably solve single-location regression
P. Marion
Raphael Berthier
Gérard Biau
Claire Boyer
57
2
0
02 Oct 2024
Spin glass model of in-context learning
Yuhao Li
Ruoran Bai
Haiping Huang
LRM
37
0
0
05 Aug 2024
Representing Rule-based Chatbots with Transformers
Dan Friedman
Abhishek Panigrahi
Danqi Chen
56
1
0
15 Jul 2024
Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons
Yongqi Leng
Deyi Xiong
32
5
0
09 Jul 2024
Expressivity of Neural Networks with Random Weights and Learned Biases
Ezekiel Williams
Avery Hee-Woon Ryoo
Thomas Jiralerspong
Alexandre Payeur
M. Perich
Luca Mazzucato
Guillaume Lajoie
26
1
0
01 Jul 2024
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Deng Cai
Huayang Li
Tingchen Fu
Siheng Li
Weiwen Xu
...
Leyang Cui
Yan Wang
Lemao Liu
Taro Watanabe
Shuming Shi
KELM
26
2
0
24 Jun 2024
Distributed Rule Vectors is A Key Mechanism in Large Language Models' In-Context Learning
Bowen Zheng
Ming Ma
Zhongqiao Lin
Tianming Yang
23
1
0
23 Jun 2024
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
Franz Nowak
Anej Svete
Alexandra Butoi
Ryan Cotterell
ReLM
LRM
44
12
0
20 Jun 2024
Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification
Shang Liu
Zhongze Cai
Guanting Chen
Xiaocheng Li
UQCV
38
1
0
24 May 2024
Implicit In-context Learning
Zhuowei Li
Zihao Xu
Ligong Han
Yunhe Gao
Song Wen
Di Liu
Hao Wang
Dimitris N. Metaxas
38
1
0
23 May 2024
Asymptotic theory of in-context learning by linear attention
Yue M. Lu
Mary I. Letey
Jacob A. Zavatone-Veth
Anindita Maiti
C. Pehlevan
19
10
0
20 May 2024
In-Context Learning with Long-Context Models: An In-Depth Exploration
Amanda Bertsch
Maor Ivgi
Uri Alon
Jonathan Berant
Matthew R. Gormley
Graham Neubig
ReLM
AIMat
81
65
0
30 Apr 2024
LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History
Akash Gupta
Ivaxi Sheth
Vyas Raina
Mark J. F. Gales
Mario Fritz
30
4
0
28 Feb 2024
Linear Transformers are Versatile In-Context Learners
Max Vladymyrov
J. Oswald
Mark Sandler
Rong Ge
24
13
0
21 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
24
13
0
08 Feb 2024
An Information-Theoretic Analysis of In-Context Learning
Hong Jun Jeon
Jason D. Lee
Qi Lei
Benjamin Van Roy
15
18
0
28 Jan 2024
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
20
5
0
09 Jan 2024
The mechanistic basis of data dependence and abrupt learning in an in-context classification task
Gautam Reddy
19
48
0
03 Dec 2023