What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

30 May 2023

Papers citing "What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization"

50 / 62 papers shown

Title
Provable test-time adaptivity and distributional robustness of in-context learning Tianyi Ma Tengyao Wang R. Samworth 84 1 0 27 Oct 2025
A Framework for Quantifying How Pre-Training and Context Benefit In-Context Learning Bingqing Song Jiaxiang Li Rong Wang Songtao Lu Mingyi Hong 60 0 0 26 Oct 2025
In-Context Learning Is Provably Bayesian Inference: A Generalization Theory for Meta-Learning Tomoya Wakayama Taiji Suzuki UQCV BDL 175 2 0 13 Oct 2025
Pretrain-Test Task Alignment Governs Generalization in In-Context Learning Mary I. Letey Jacob A. Zavatone-Veth Yue M. Lu Cengiz Pehlevan 93 1 0 30 Sep 2025
Provable In-Context Learning of Nonlinear Regression with Transformers Hongbo Li Lingjie Duan Yingbin Liang 119 1 0 28 Jul 2025
Brewing Knowledge in Context: Distillation Perspectives on In-Context Learning Chengye Li Haiyun Liu Yuanxi Li 189 0 0 13 Jun 2025
From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium Xie Yi Zhanke Zhou Chentao Cao Qiyu Niu Tongliang Liu Bo Han 186 4 0 09 Jun 2025
Neither Stochastic Parroting nor AGI: LLMs Solve Tasks through Context-Directed Extrapolation from Training Data Priors Harish Tayyar Madabushi Melissa Torgbi C. Bonial 283 3 0 29 May 2025
The Role of Diversity in In-Context Learning for Large Language Models Wenyang Xiao Haoyu Zhao Lingxiao Huang 305 1 0 26 May 2025
Adversarially Pretrained Transformers May Be Universally Robust In-Context Learners Soichiro Kumano Hiroshi Kera Toshihiko Yamasaki AAML 390 1 0 20 May 2025
Toward Efficient Exploration by Large Language Model Agents Dilip Arumugam Thomas L. Griffiths LLMAG 358 8 0 29 Apr 2025
A Theoretical Framework for OOD Robustness in Transformers using Gevrey Classes Yu Wang Fu-Chieh Chang Pei-Yuan Wu OODD ReLM LRM 214 0 0 17 Apr 2025
Reasoning without Regret Tarun Chitra OffRL LRM 154 0 0 14 Apr 2025
Enough Coin Flips Can Make LLMs Act Bayesian Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Ritwik Gupta Rodolfo Corona Jiaxin Ge Eric Wang Dan Klein Trevor Darrell David M. Chan BDL LRM 223 10 0 06 Mar 2025
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops International Conference on Learning Representations (ICLR), 2025 Shi Fu Yingjie Wang Yuzhu Chen Xinmei Tian Dacheng Tao 267 7 0 26 Feb 2025
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization International Conference on Learning Representations (ICLR), 2025 Zixuan Gong Xiaolin Hu Huayi Tang Yong Liu 292 2 0 24 Feb 2025
AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification Jue Chen Tianchu Yao Chao Qu Bin Li Minghao Yang ... Haozhe Wang Xihe Qiu Wei Chu Yinghui Xu Yuan Qi OffRL LRM 274 12 0 17 Feb 2025
Zero-shot Model-based Reinforcement Learning using Large Language Models International Conference on Learning Representations (ICLR), 2024 Khyati Khandelwal Youssef Attia El Hili Ambroise Odonnat Oussama Zekri Albert Thomas Giuseppe Paolo Maurizio Filippone I. Redko Jun Yao OffRL 246 3 0 17 Feb 2025
Learning Task Representations from In-Context Learning Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Baturay Saglam Zhuoran Yang Zhuoran Yang Dionysis Kalogerias Amin Karbasi 242 6 0 08 Feb 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data? International Conference on Learning Representations (ICLR), 2025 Yutong Yin Zhaoran Wang LRM ReLM 1.0K 2 0 27 Jan 2025
Rethinking Associative Memory Mechanism in Induction Head Shuo Wang Issei Sato 357 0 0 16 Dec 2024
Re-examining learning linear functions in context Deutsche Jahrestagung für Künstliche Intelligenz (KI), 2024 Omar Naim Guilhem Fouilhé Nicholas Asher 372 4 0 18 Nov 2024
Pretrained transformer efficiently learns low-dimensional target functions in-context Neural Information Processing Systems (NeurIPS), 2024 Kazusato Oko Yujin Song Taiji Suzuki Denny Wu 234 22 0 04 Nov 2024
Bayesian scaling laws for in-context learning Aryaman Arora Dan Jurafsky Christopher Potts Noah D. Goodman 373 11 0 21 Oct 2024
A Theoretical Survey on Foundation Models Shi Fu Yuzhu Chen Yingjie Wang Dacheng Tao 247 0 0 15 Oct 2024
On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures Wei Shen Ruida Zhou Jing Yang Cong Shen 266 6 0 15 Oct 2024
Can In-context Learning Really Generalize to Out-of-distribution Tasks? International Conference on Learning Representations (ICLR), 2024 Qixun Wang Yifei Wang Yisen Wang Xianghua Ying OOD 171 15 0 13 Oct 2024
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition Zheyang Xiong Ziyang Cai John Cooper Albert Ge Vasilis Papageorgiou ... Saurabh Agarwal Grigorios G Chrysos Samet Oymak Kangwook Lee Dimitris Papailiopoulos LRM 203 8 0 08 Oct 2024
Large Language Models as Markov Chains Oussama Zekri Ambroise Odonnat Khyati Khandelwal Linus Bleistein Nicolas Boullé I. Redko 327 25 0 03 Oct 2024
In-Context Learning with Representations: Contextual Generalization of Trained Transformers Neural Information Processing Systems (NeurIPS), 2024 Tong Yang Yu Huang Yingbin Liang Yuejie Chi MLT 245 27 0 19 Aug 2024
Pre-training and in-context learning IS Bayesian inference a la De Finetti Naimeng Ye Hanming Yang Andrew Siah Hongseok Namkoong BDL UQLM 228 3 0 06 Aug 2024
Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning Akshara Prabhakar Thomas Griffiths R. Thomas McCoy LRM 197 29 0 01 Jul 2024
Estimating the Hallucination Rate of Generative AI Andrew Jesson Nicolas Beltran-Velez Quentin Chu Sweta Karlekar Jannik Kossen Yarin Gal John P. Cunningham David M. Blei 411 25 0 11 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective Xinhao Yao Xiaolin Hu Shenzhi Yang Yong Liu 184 3 0 06 Jun 2024
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective Fabian Falck Ziyu Wang Chris Holmes 330 36 0 02 Jun 2024
Mind the Inconspicuous: Revealing the Hidden Weakness in Aligned LLMs' Refusal Boundaries Jiahao Yu Haozheng Luo Jerry Yao-Chieh Hu Wenbo Guo Han Liu Xinyu Xing 212 21 0 31 May 2024
A Theoretical Understanding of Self-Correction through In-context Alignment Yifei Wang Yuyang Wu Zeming Wei Stefanie Jegelka Yisen Wang LRM 218 51 0 28 May 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers Lorenzo Tiberi Francesca Mignacco Kazuki Irie H. Sompolinsky 315 9 0 24 May 2024
Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification Shang Liu Zhongze Cai Guanting Chen Xiaocheng Li UQCV 173 2 0 24 May 2024
Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making Hanzhao Wang Yu Pan Fupeng Sun Shang Liu Kalyan Talluri Guanting Chen Xiaocheng Li OffRL 212 2 0 23 May 2024
Can large language models explore in-context? Neural Information Processing Systems (NeurIPS), 2024 Akshay Krishnamurthy Keegan Harris Dylan J. Foster Cyril Zhang Aleksandrs Slivkins LM&Ro LLMAG LRM 498 49 0 22 Mar 2024
Rectifying Demonstration Shortcut in In-Context Learning North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Joonwon Jang Sanghwan Jang Wonbin Kweon Minjin Jeon Hwanjo Yu 235 4 0 14 Mar 2024
Understanding In-Context Learning with a Pelican Soup Framework Ting-Rui Chiang Dani Yogatama 119 4 0 16 Feb 2024
Position: Graph Foundation Models are Already Here Haitao Mao Zhikai Chen Wenzhuo Tang Jianan Zhao Yao Ma Tong Zhao Neil Shah Mikhail Galkin Shucheng Zhou AI4CE 306 69 0 03 Feb 2024
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape Juno Kim Taiji Suzuki 308 33 0 02 Feb 2024
An Information-Theoretic Analysis of In-Context Learning International Conference on Machine Learning (ICML), 2024 Hong Jun Jeon Jason D. Lee Qi Lei Benjamin Van Roy 305 33 0 28 Jan 2024
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs International Conference on Machine Learning (ICML), 2023 Andries P. Smit Paul Duckworth Nathan Grinsztajn Thomas D. Barrett Arnu Pretorius 310 51 0 29 Nov 2023
A Principled Framework for Knowledge-enhanced Large Language Model Saizhuo Wang Zhihan Liu Zhaoran Wang Jian Guo LRM 126 1 0 18 Nov 2023
Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions International Conference on Learning Representations (ICLR), 2023 Sachin Kumar Chan Young Park Yulia Tsvetkov VLM 167 5 0 13 Nov 2023
The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Yuxiang Zhou Jiazheng Li Yanzheng Xiang Hanqi Yan Lin Gui Yulan He 259 29 0 01 Nov 2023