Transformers as Algorithms: Generalization and Stability in In-context Learning

International Conference on Machine Learning (ICML), 2023
17 January 2023
Yingcong Li
M. E. Ildiz
Dimitris Papailiopoulos
Samet Oymak

Papers citing "Transformers as Algorithms: Generalization and Stability in In-context Learning"

50 / 86 papers shown
Genomic Next-Token Predictors are In-Context Learners
Nathan Breslow
Aayush Mishra
Mahler Revsine
Michael C. Schatz
Anqi Liu
Daniel Khashabi
219
0
0
16 Nov 2025
Scaling Laws and In-Context Learning: A Unified Theoretical Framework
Sushant Mehta
Ishan Gupta
93
0
0
09 Nov 2025
Optimal Attention Temperature Enhances In-Context Learning under Distribution Shift
Samet Demir
Zafer Dogan
116
0
0
03 Nov 2025
A Framework for Quantifying How Pre-Training and Context Benefit In-Context Learning
Bingqing Song
Jiaxiang Li
Rong Wang
Songtao Lu
Mingyi Hong
112
0
0
26 Oct 2025
Optimality and NP-Hardness of Transformers in Learning Markovian Dynamical Functions
Yanna Ding
Songtao Lu
Yingdong Lu
T. Nowicki
Jianxi Gao
237
0
0
21 Oct 2025
In-Context Learning Is Provably Bayesian Inference: A Generalization Theory for Meta-Learning
Tomoya Wakayama
Taiji Suzuki
UQCV
BDL
370
2
0
13 Oct 2025
Fine-Grained Emotion Recognition via In-Context Learning
Zhaochun Ren
Zhou Yang
Chenglong Ye
Haizhou Sun
Chao Chen
Xiaofei Zhu
Xiangwen Liao
116
0
0
08 Oct 2025
Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories
Nilay Naharas
Dang Nguyen
Nesihan Bulut
M. Bateni
Vahab Mirrokni
Baharan Mirzasoleiman
100
0
0
01 Oct 2025
Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
Blake Bordelon
Mary I. Letey
Cengiz Pehlevan
169
0
0
01 Oct 2025
Pretrain-Test Task Alignment Governs Generalization in In-Context Learning
Mary I. Letey
Jacob A. Zavatone-Veth
Yue M. Lu
Cengiz Pehlevan
125
1
0
30 Sep 2025
Theoretical Bounds for Stable In-Context Learning
Tongxi Wang
Zhuoyang Xia
120
0
0
25 Sep 2025
On Theoretical Interpretations of Concept-Based In-Context Learning
Huaze Tang
Tianren Peng
Shao-Lun Huang
193
0
0
25 Sep 2025
Decoupled-Value Attention for Prior-Data Fitted Networks: GP Inference for Physical Equations
Kaustubh Sharma
Simardeep Singh
Parikshit Pareek
128
0
0
25 Sep 2025
KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning
Vaibhav Singh
Soumya Suvra Ghosal
Kapu Nirmal Joshua
Soumyabrata Pal
Sayak Ray Chowdhury
104
0
0
19 Sep 2025
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
Akshit Sinha
Arvindh Arun
Shashwat Goel
Steffen Staab
Jonas Geiping
ALM
LRM
285
8
0
11 Sep 2025
Observational Multiplicity
Erin E. George
Deanna Needell
Berk Ustun
116
0
0
30 Jul 2025
Tuning without Peeking: Provable Generalization Bounds and Robust LLM Post-Training
Ismail Labiad
Mathurin Videau
Matthieu Kowalski
Marc Schoenauer
Alessandro Leite
Julia Kempe
O. Teytaud
AAML
276
0
0
02 Jul 2025
CooT: Learning to Coordinate In-Context with Coordination Transformers
Huai-Chih Wang
Hsiang-Chun Chuang
Hsi-Chun Cheng
Dai-Jie Wu
Shao-Hua Sun
OffRL
149
0
0
30 Jun 2025
When and How Unlabeled Data Provably Improve In-Context Learning
Yingcong Li
Xiangyu Chang
Muti Kara
Xiaofeng Liu
Amit K. Roy-Chowdhury
Samet Oymak
253
2
0
18 Jun 2025
Brewing Knowledge in Context: Distillation Perspectives on In-Context Learning
Chengye Li
Haiyun Liu
Yuanxi Li
237
0
0
13 Jun 2025
Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods
Zhaiming Shen
Alexander Hsu
Rongjie Lai
Wenjing Liao
MLT
338
2
0
12 Jun 2025
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning
Vahid Balazadeh
Hamidreza Kamkari
Valentin Thomas
Benson Li
Junwei Ma
Jesse C. Cresswell
Rahul G. Krishnan
CML
194
6
0
09 Jun 2025
Pre-trained Large Language Models Learn Hidden Markov Models In-context
Yijia Dai
Zhaolin Gao
Yahya Sattar
Sarah Dean
Jennifer J. Sun
270
1
0
08 Jun 2025
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
D. Kunin
Giovanni Luca Marchetti
F. Chen
Dhruva Karkada
James B. Simon
M. DeWeese
Surya Ganguli
Nina Miolane
417
4
0
06 Jun 2025
Neither Stochastic Parroting nor AGI: LLMs Solve Tasks through Context-Directed Extrapolation from Training Data Priors
Harish Tayyar Madabushi
Melissa Torgbi
C. Bonial
351
3
0
29 May 2025
The Role of Diversity in In-Context Learning for Large Language Models
Wenyang Xiao
Haoyu Zhao
Lingxiao Huang
357
1
0
26 May 2025
Attention-based clustering
Rodrigo Maulen-Soto
Claire Boyer
Pierre Marion
332
0
0
19 May 2025
Understanding In-context Learning of Addition via Activation Subspaces
Xinyan Hu
Kayo Yin
Michael I. Jordan
Jacob Steinhardt
Lijie Chen
452
7
0
08 May 2025
Compute-Optimal LLMs Provably Generalize Better With Scale
International Conference on Learning Representations (ICLR), 2025
Marc Finzi
Sanyam Kapoor
Diego Granziol
Anming Gu
Christopher De Sa
J. Zico Kolter
Andrew Gordon Wilson
412
5
0
21 Apr 2025
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
International Conference on Learning Representations (ICLR), 2025
Hongkang Li
Yihua Zhang
Shuai Zhang
Ming Wang
Sijia Liu
Pin-Yu Chen
MoMe
765
18
0
15 Apr 2025
Test-Time Training Provably Improves Transformers as In-context Learners
Halil Alperen Gozeten
M. E. Ildiz
Xuechen Zhang
Mahdi Soltanolkotabi
Marco Mondelli
Samet Oymak
310
5
0
14 Mar 2025
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
Alireza Mousavi-Hosseini
Clayton Sanford
Denny Wu
Murat A. Erdogdu
347
3
0
14 Mar 2025
Provable Benefits of Task-Specific Prompts for In-context Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Xiangyu Chang
Yingcong Li
Muti Kara
Samet Oymak
Amit K. Roy-Chowdhury
398
1
0
03 Mar 2025
In-Context Learning with Hypothesis-Class Guidance
Ziqian Lin
Shubham Kumar Bharti
Kangwook Lee
481
0
0
27 Feb 2025
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops
International Conference on Learning Representations (ICLR), 2025
Shi Fu
Yingjie Wang
Yuzhu Chen
Xinmei Tian
Dacheng Tao
371
8
0
26 Feb 2025
In-context Learning of Evolving Data Streams with Tabular Foundational Models
Afonso Lourenço
João Gama
Eric P. Xing
Goreti Marreiros
423
0
0
24 Feb 2025
On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li
Chenyang Zhang
Xingwu Chen
Yuan Cao
Difan Zou
385
3
0
24 Feb 2025
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
International Conference on Learning Representations (ICLR), 2025
Zixuan Gong
Xiaolin Hu
Huayi Tang
Yong Liu
332
2
0
24 Feb 2025
Vector-ICL: In-context Learning with Continuous Vector Representations
International Conference on Learning Representations (ICLR), 2024
Yufan Zhuang
Chandan Singh
Liyuan Liu
Jingbo Shang
Jianfeng Gao
432
10
0
21 Feb 2025
Zero-shot Model-based Reinforcement Learning using Large Language Models
International Conference on Learning Representations (ICLR), 2024
Khyati Khandelwal
Youssef Attia El Hili
Ambroise Odonnat
Oussama Zekri
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
I. Redko
Jun Yao
OffRL
322
4
0
17 Feb 2025
Transformers versus the EM Algorithm in Multi-class Clustering
Yihan He
Hong-Yu Chen
Yuan Cao
Jianqing Fan
Han Liu
285
2
0
09 Feb 2025
Learning Task Representations from In-Context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Baturay Saglam
Zhuoran Yang
Dionysis Kalogerias
Amin Karbasi
278
7
0
08 Feb 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
International Conference on Learning Representations (ICLR), 2025
Yutong Yin
Zhaoran Wang
LRM
ReLM
1.2K
2
0
27 Jan 2025
Unlocking In-Context Learning for Natural Datasets Beyond Language Modelling
Jelena Bratulić
Sudhanshu Mittal
David T. Hoffmann
Samuel Böhm
R. Schirrmeister
T. Ball
Christian Rupprecht
Thomas Brox
409
1
0
09 Jan 2025
In-Context Learning with Iterative Demonstration Selection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Chengwei Qin
Aston Zhang
Chong Chen
Anirudh Dagar
Wenming Ye
LRM
450
73
0
31 Dec 2024
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
Neural Information Processing Systems (NeurIPS), 2024
Zihao Li
Yuan Cao
Cheng Gao
Yihan He
Han Liu
Jason M. Klusowski
Jianqing Fan
Mengdi Wang
MLT
413
13
0
16 Nov 2024
Toward Understanding In-context vs. In-weight Learning
International Conference on Learning Representations (ICLR), 2024
Bryan Chan
Xinyi Chen
András Gyorgy
Dale Schuurmans
407
11
0
30 Oct 2024
Multi-agent cooperation through learning-aware policy gradients
International Conference on Learning Representations (ICLR), 2024
Alexander Meulemans
Seijin Kobayashi
J. Oswald
Nino Scherrer
Eric Elmoznino
Blake A. Richards
Guillaume Lajoie
Blaise Agüera y Arcas
João Sacramento
213
1
0
24 Oct 2024
On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures
Wei Shen
Ruida Zhou
Jing Yang
Cong Shen
354
6
0
15 Oct 2024
Towards Understanding the Universality of Transformers for Next-Token Prediction
International Conference on Learning Representations (ICLR), 2024
Michael E. Sander
Gabriel Peyré
CML
320
4
0
03 Oct 2024