Transformers as Algorithms: Generalization and Stability in In-context Learning

International Conference on Machine Learning (ICML), 2023

17 January 2023

Yingcong Li

M. E. Ildiz

Dimitris Papailiopoulos

Samet Oymak

Papers citing "Transformers as Algorithms: Generalization and Stability in In-context Learning"

Large Language Models as Markov Chains

423

03 Oct 2024

Transformers Handle Endogeneity in In-Context Linear Regression

International Conference on Learning Representations (ICLR), 2024

Haodong Liang

Krishnakumar Balasubramanian

Lifeng Lai

520

02 Oct 2024

Zero-shot forecasting of chaotic systems

International Conference on Learning Representations (ICLR), 2024

Yuanzhao Zhang

William Gilpin

AI4TS

643

24 Sep 2024

Differentially Private Kernel Density Estimation

Erzhi Liu

Jerry Yao-Chieh Hu

Alex Reneau

Zhao Song

Han Liu

449

03 Sep 2024

A Statistical Framework for Data-dependent Retrieval-Augmented Models

International Conference on Machine Learning (ICML), 2024

Soumya Basu

A. S. Rawat

Manzil Zaheer

RALM

271

27 Aug 2024

Spin glass model of in-context learning

Physical Review E (Phys. Rev. E), 2024

460

05 Aug 2024

Representing Rule-based Chatbots with Transformers

Dan Friedman

Abhishek Panigrahi

Danqi Chen

394

15 Jul 2024

Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

620

07 Jun 2024

Why Larger Language Models Do In-context Learning Differently?

263

30 May 2024

Adaptive In-conversation Team Building for Language Model Agents

502

29 May 2024

Unsupervised Meta-Learning via In-Context Learning

361

25 May 2024

Asymptotic theory of in-context learning by linear attention

Yue M. Lu

Mary I. Letey

Jacob A. Zavatone-Veth

Anindita Maiti

Cengiz Pehlevan

522

20 May 2024

Concept-aware Data Construction Improves In-context Learning of Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Michal Štefánik

Marek Kadlcík

Petr Sojka

262

08 Mar 2024

Linear Transformers are Versatile In-Context Learners

206

21 Feb 2024

Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities

Ting-Rui Chiang

Dani Yogatama

163

16 Feb 2024

Implicit Bias and Fast Convergence Rates for Self-attention

Bhavya Vasudeva

Puneesh Deora

Christos Thrampoulidis

377

08 Feb 2024

A phase transition between positional and semantic learning in a solvable model of dot-product attention

Neural Information Processing Systems (NeurIPS), 2024

Lenka Zdeborová

242

06 Feb 2024

Attention with Markov: A Framework for Principled Analysis of Transformers via Markov Chains

Ashok Vardhan Makkuva

382

06 Feb 2024

Superiority of Multi-Head Attention in In-Context Linear Regression

205

30 Jan 2024

An Information-Theoretic Analysis of In-Context Learning

International Conference on Machine Learning (ICML), 2024

353

28 Jan 2024

Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

483

11 Jan 2024

Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning

355

28 Dec 2023

Looped Transformers are Better at Learning Learning Algorithms

International Conference on Learning Representations (ICLR), 2023

Liu Yang

Kangwook Lee

Robert D. Nowak

Dimitris Papailiopoulos

438

21 Nov 2023

In-Context Learning Dynamics with Random Binary Sequences

International Conference on Learning Representations (ICLR), 2023

420

26 Oct 2023

Function Vectors in Large Language Models

International Conference on Learning Representations (ICLR), 2023

303

182

23 Oct 2023

On the Optimization and Generalization of Multi-head Attention

Puneesh Deora

Rouzbeh Ghaderi

Hossein Taheri

Christos Thrampoulidis

MLT

276

19 Oct 2023

IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models

250

16 Oct 2023

In-Context Convergence of Transformers

International Conference on Machine Learning (ICML), 2023

302

08 Oct 2023

Towards Better Chain-of-Thought Prompting Strategies: A Survey

342

08 Oct 2023

Are Emergent Abilities in Large Language Models just In-Context Learning?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Sheng Lu

Irina Bigoulaeva

Rachneet Sachdeva

Harish Tayyar Madabushi

Iryna Gurevych

LRM ELM ReLM

427

132

04 Sep 2023

Can Transformers Learn Optimal Filtering for Unknown Systems?

IEEE Control Systems Letters (L-CSS), 2023

216

16 Aug 2023

Max-Margin Token Selection in Attention Mechanism

Neural Information Processing Systems (NeurIPS), 2023

Davoud Ataee Tarzanagh

Yingcong Li

Xuechen Zhang

Samet Oymak

507

23 Jun 2023

Trained Transformers Learn Linear Models In-Context

Journal of Machine Learning Research (JMLR), 2023

Ruiqi Zhang

Spencer Frei

Peter L. Bartlett

409

277

16 Jun 2023

Schema-learning and rebinding as mechanisms of in-context learning and emergence

Neural Information Processing Systems (NeurIPS), 2023

Siva K. Swaminathan

Antoine Dedieu

Rajkumar Vasudeva Raju

Murray Shanahan

Miguel Lazaro-Gredilla

Dileep George

223

16 Jun 2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

350

30 May 2023

Dissecting Chain-of-Thought: Compositionality through In-Context Filtering and Learning

Yingcong Li

Kartik K. Sreenivasan

Angeliki Giannou

Dimitris Papailiopoulos

Samet Oymak

LRM

239

30 May 2023