Transformers as Algorithms: Generalization and Stability in In-context Learning

International Conference on Machine Learning (ICML), 2023
17 January 2023
Yingcong Li, M. E. Ildiz, Dimitris Papailiopoulos, Samet Oymak
arXiv (abs) · PDF · HTML

Papers citing "Transformers as Algorithms: Generalization and Stability in In-context Learning"

Showing 36 of 86 citing papers.
Large Language Models as Markov Chains
Oussama Zekri, Ambroise Odonnat, Khyati Khandelwal, Linus Bleistein, Nicolas Boullé, I. Redko
03 Oct 2024

Transformers Handle Endogeneity in In-Context Linear Regression
International Conference on Learning Representations (ICLR), 2024
Haodong Liang, Krishnakumar Balasubramanian, Lifeng Lai
02 Oct 2024

Zero-shot forecasting of chaotic systems
International Conference on Learning Representations (ICLR), 2024
Yuanzhao Zhang, William Gilpin
24 Sep 2024

Differentially Private Kernel Density Estimation
Erzhi Liu, Jerry Yao-Chieh Hu, Alex Reneau, Zhao Song, Han Liu
03 Sep 2024

A Statistical Framework for Data-dependent Retrieval-Augmented Models
International Conference on Machine Learning (ICML), 2024
Soumya Basu, A. S. Rawat, Manzil Zaheer
27 Aug 2024

Spin glass model of in-context learning
Physical Review E (Phys. Rev. E), 2024
Yuhao Li, Ruoran Bai, Haiping Huang
05 Aug 2024

Representing Rule-based Chatbots with Transformers
Dan Friedman, Abhishek Panigrahi, Danqi Chen
15 Jul 2024

Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert D. Nowak
07 Jun 2024

Why Larger Language Models Do In-context Learning Differently?
Zhenmei Shi, Junyi Wei, Zhuoyan Xu, Yingyu Liang
30 May 2024

Adaptive In-conversation Team Building for Language Model Agents
Linxin Song, Jiale Liu, Jieyu Zhang, Shaokun Zhang, Ao Luo, Shijian Wang, Qingyun Wu, Chi Wang
29 May 2024

Unsupervised Meta-Learning via In-Context Learning
Anna Vettoruzzo, Lorenzo Braccaioli, Joaquin Vanschoren, M. Nowaczyk
25 May 2024

Asymptotic theory of in-context learning by linear attention
Yue M. Lu, Mary I. Letey, Jacob A. Zavatone-Veth, Anindita Maiti, Cengiz Pehlevan
20 May 2024

Concept-aware Data Construction Improves In-context Learning of Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Michal Štefánik, Marek Kadlcík, Petr Sojka
08 Mar 2024

Linear Transformers are Versatile In-Context Learners
Max Vladymyrov, J. Oswald, Mark Sandler, Rong Ge
21 Feb 2024

Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities
Ting-Rui Chiang, Dani Yogatama
16 Feb 2024

Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis
08 Feb 2024

A phase transition between positional and semantic learning in a solvable model of dot-product attention
Neural Information Processing Systems (NeurIPS), 2024
Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová
06 Feb 2024

Attention with Markov: A Framework for Principled Analysis of Transformers via Markov Chains
Ashok Vardhan Makkuva, Marco Bondaschi, Adway Girish, Alliot Nagle, Martin Jaggi, Hyeji Kim, Michael C. Gastpar
06 Feb 2024

Superiority of Multi-Head Attention in In-Context Linear Regression
Yingqian Cui, Jie Ren, Pengfei He, Shucheng Zhou, Yue Xing
30 Jan 2024

An Information-Theoretic Analysis of In-Context Learning
International Conference on Machine Learning (ICML), 2024
Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy
28 Jan 2024

Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Shuai Zhao, Meihuizi Jia, Anh Tuan Luu, Fengjun Pan, Jinming Wen
11 Jan 2024

Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning
Chengwei Qin, Wenhan Xia, Fangkai Jiao, Chen Chen, Yuchen Hu, Bosheng Ding, R. Chen, Shafiq Joty
28 Dec 2023

Looped Transformers are Better at Learning Learning Algorithms
International Conference on Learning Representations (ICLR), 2023
Liu Yang, Kangwook Lee, Robert D. Nowak, Dimitris Papailiopoulos
21 Nov 2023

In-Context Learning Dynamics with Random Binary Sequences
International Conference on Learning Representations (ICLR), 2023
Eric J. Bigelow, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, T. Ullman
26 Oct 2023

Function Vectors in Large Language Models
International Conference on Learning Representations (ICLR), 2023
Eric Todd, Millicent Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau
23 Oct 2023

On the Optimization and Generalization of Multi-head Attention
Puneesh Deora, Rouzbeh Ghaderi, Hossein Taheri, Christos Thrampoulidis
19 Oct 2023

IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models
Shaokun Zhang, Xiaobo Xia, Zhaoqing Wang, Ling-Hao Chen, Jiale Liu, Qingyun Wu, Tongliang Liu
16 Oct 2023

In-Context Convergence of Transformers
International Conference on Machine Learning (ICML), 2023
Yu Huang, Yuan Cheng, Yingbin Liang
08 Oct 2023

Towards Better Chain-of-Thought Prompting Strategies: A Survey
Zihan Yu, Liang He, Zhen Wu, Xinyu Dai, Jiajun Chen
08 Oct 2023

Are Emergent Abilities in Large Language Models just In-Context Learning?
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Sheng Lu, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych
04 Sep 2023

Can Transformers Learn Optimal Filtering for Unknown Systems?
IEEE Control Systems Letters (L-CSS), 2023
Haldun Balim, Zhe Du, Samet Oymak, N. Ozay
16 Aug 2023

Max-Margin Token Selection in Attention Mechanism
Neural Information Processing Systems (NeurIPS), 2023
Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak
23 Jun 2023

Trained Transformers Learn Linear Models In-Context
Journal of Machine Learning Research (JMLR), 2023
Ruiqi Zhang, Spencer Frei, Peter L. Bartlett
16 Jun 2023

Schema-learning and rebinding as mechanisms of in-context learning and emergence
Neural Information Processing Systems (NeurIPS), 2023
Siva K. Swaminathan, Antoine Dedieu, Rajkumar Vasudeva Raju, Murray Shanahan, Miguel Lazaro-Gredilla, Dileep George
16 Jun 2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Yufeng Zhang, Fengzhuo Zhang, Zhuoran Yang, Zhaoran Wang
30 May 2023

Dissecting Chain-of-Thought: Compositionality through In-Context Filtering and Learning
Yingcong Li, Kartik K. Sreenivasan, Angeliki Giannou, Dimitris Papailiopoulos, Samet Oymak
30 May 2023