Trained Transformers Learn Linear Models In-Context

16 June 2023

Papers citing "Trained Transformers Learn Linear Models In-Context"

Title | Authors | Tags | Counts | Date
Understanding In-context Learning of Addition via Activation Subspaces | Xinyan Hu Kayo Yin Michael I. Jordan Jacob Steinhardt Lijie Chen | - | 44 0 0 | 08 May 2025
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias | Ruiquan Huang Yingbin Liang Jing Yang | - | 46 0 0 | 02 May 2025
On the generalization of language models from in-context learning and finetuning: a controlled study | Andrew Kyle Lampinen Arslan Chaudhry Stephanie Chan Cody Wild Diane Wan Alex Ku Jorg Bornschein Razvan Pascanu Murray Shanahan James L. McClelland | - | 46 0 0 | 01 May 2025
ICL CIPHERS: Quantifying "Learning" in In-Context Learning via Substitution Ciphers | Zhouxiang Fang Aayush Mishra Muhan Gao Anqi Liu Daniel Khashabi | - | 44 0 0 | 28 Apr 2025
How Private is Your Attention? Bridging Privacy with In-Context Learning | Soham Bonnerjee Zhen Wei Yeon Anna Asch Sagnik Nandy Promit Ghosal | - | 40 0 0 | 22 Apr 2025
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers | Hongkang Li Yihua Zhang Shuai Zhang M. Wang Sijia Liu Pin-Yu Chen | MoMe | 57 2 0 | 15 Apr 2025
An extension of linear self-attention for in-context learning | Katsuyuki Hagiwara | - | 34 0 0 | 31 Mar 2025
In-Context Learning with Hypothesis-Class Guidance | Ziqian Lin Shubham Kumar Bharti Kangwook Lee | - | 69 0 0 | 27 Feb 2025
Vector-ICL: In-context Learning with Continuous Vector Representations | Yufan Zhuang Chandan Singh Liyuan Liu Jingbo Shang Jianfeng Gao | - | 52 3 0 | 21 Feb 2025
Tensor Product Attention Is All You Need | Yifan Zhang Yifeng Liu Huizhuo Yuan Zhen Qin Yang Yuan Q. Gu Andrew Chi-Chih Yao | - | 75 9 0 | 11 Jan 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers | Jiajun Song Zhuoyan Xu Yiqiao Zhong | - | 78 4 0 | 31 Dec 2024
LaVin-DiT: Large Vision Diffusion Transformer | Zhaoqing Wang Xiaobo Xia Runnan Chen Dongdong Yu Changhu Wang M. Gong Tongliang Liu | - | 92 6 0 | 18 Nov 2024
All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling | Emanuele Marconato Sébastien Lachapelle Sebastian Weichwald Luigi Gresele | - | 61 3 0 | 30 Oct 2024
Toward Understanding In-context vs. In-weight Learning | Bryan Chan Xinyi Chen András Gyorgy Dale Schuurmans | - | 65 3 0 | 30 Oct 2024
In-context learning and Occam's razor | Eric Elmoznino Tom Marty Tejas Kasetty Léo Gagnon Sarthak Mittal Mahan Fathi Dhanya Sridhar Guillaume Lajoie | - | 32 1 0 | 17 Oct 2024
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery | Renpu Liu Ruida Zhou Cong Shen Jing Yang | - | 23 0 0 | 17 Oct 2024
Context-Scaling versus Task-Scaling in In-Context Learning | Amirhesam Abedsoltan Adityanarayanan Radhakrishnan Jingfeng Wu M. Belkin | ReLM LRM | 32 3 0 | 16 Oct 2024
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent | Bo Chen Xiaoyu Li Yingyu Liang Zhenmei Shi Zhao-quan Song | - | 83 19 0 | 15 Oct 2024
Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information | Yongheng Zhang Qiguang Chen Jingxuan Zhou Peng Wang Jiasheng Si Jin Wang Wenpeng Lu Libo Qin | LRM | 44 3 0 | 06 Oct 2024
Towards Understanding the Universality of Transformers for Next-Token Prediction | Michael E. Sander Gabriel Peyré | CML | 29 0 0 | 03 Oct 2024
Transformers Handle Endogeneity in In-Context Linear Regression | Haodong Liang Krishnakumar Balasubramanian Lifeng Lai | - | 32 1 0 | 02 Oct 2024
Attention layers provably solve single-location regression | P. Marion Raphael Berthier Gérard Biau Claire Boyer | - | 57 2 0 | 02 Oct 2024
Spin glass model of in-context learning | Yuhao Li Ruoran Bai Haiping Huang | LRM | 37 0 0 | 05 Aug 2024
Representing Rule-based Chatbots with Transformers | Dan Friedman Abhishek Panigrahi Danqi Chen | - | 56 1 0 | 15 Jul 2024
On Understanding Attention-Based In-Context Learning for Categorical Data | Aaron T. Wang William Convertino Xiang Cheng Ricardo Henao Lawrence Carin | - | 35 0 0 | 27 May 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers | Lorenzo Tiberi Francesca Mignacco Kazuki Irie H. Sompolinsky | - | 42 5 0 | 24 May 2024
Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification | Shang Liu Zhongze Cai Guanting Chen Xiaocheng Li | UQCV | 38 1 0 | 24 May 2024
Asymptotic theory of in-context learning by linear attention | Yue M. Lu Mary I. Letey Jacob A. Zavatone-Veth Anindita Maiti C. Pehlevan | - | 19 10 0 | 20 May 2024
Linear Transformers are Versatile In-Context Learners | Max Vladymyrov J. Oswald Mark Sandler Rong Ge | - | 24 13 0 | 21 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention | Bhavya Vasudeva Puneesh Deora Christos Thrampoulidis | - | 24 13 0 | 08 Feb 2024
An Information-Theoretic Analysis of In-Context Learning | Hong Jun Jeon Jason D. Lee Qi Lei Benjamin Van Roy | - | 15 18 0 | 28 Jan 2024
Setting the Record Straight on Transformer Oversmoothing | G. Dovonon M. Bronstein Matt J. Kusner | - | 20 5 0 | 09 Jan 2024
Transformers are Provably Optimal In-context Estimators for Wireless Communications | Vishnu Teja Kunde Vicram Rajagopalan Chandra Shekhara Kaushik Valmeekam Krishna R. Narayanan S. Shakkottai D. Kalathil J. Chamberland | - | 29 4 0 | 01 Nov 2023
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining | Licong Lin Yu Bai Song Mei | OffRL | 30 42 0 | 12 Oct 2023
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding | Yuchen Li Yuan-Fang Li Andrej Risteski | - | 107 61 0 | 07 Mar 2023
HumanMAC: Masked Motion Completion for Human Motion Prediction | Ling-Hao Chen Jiawei Zhang Ye-rong Li Yiren Pang Xiaobo Xia Tongliang Liu | DiffM VGen | 30 54 0 | 07 Feb 2023