Position: Understanding LLMs Requires More Than Statistical Generalization

3 May 2024

Patrik Reizinger

Szilvia Ujváry

Anna Mészáros

Wieland Brendel

Papers citing "Position: Understanding LLMs Requires More Than Statistical Generalization"

| Title | Authors | Tags | | | | Date |
|---|---|---|---|---|---|---|
| Out-of-distribution generalization via composition: a lens through induction heads in Transformers | Jiajun Song, Zhuoyan Xu, Yiqiao Zhong | | 67 | 4 | 0 | 31 Dec 2024 |
| All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling | Emanuele Marconato, Sébastien Lachapelle, Sebastian Weichwald, Luigi Gresele | | 55 | 3 | 0 | 30 Oct 2024 |
| Towards Understanding the Relationship between In-context Learning and Compositional Generalization | Sungjun Han, Sebastian Padó | CoGe | 16 | 2 | 0 | 18 Mar 2024 |
| On Provable Length and Compositional Generalization | Kartik Ahuja, Amin Mansouri | OODD | 18 | 7 | 0 | 07 Feb 2024 |
| Understanding the Effects of RLHF on LLM Generalisation and Diversity | Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu | AI4CE, ALM | 95 | 121 | 0 | 10 Oct 2023 |
| Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 301 | 11,730 | 0 | 04 Mar 2022 |
| On PAC-Bayesian reconstruction guarantees for VAEs | Badr-Eddine Chérief-Abdellatif, Yuyang Shi, Arnaud Doucet, Benjamin Guedj | DRL | 35 | 17 | 0 | 23 Feb 2022 |
| Improving Systematic Generalization Through Modularity and Augmentation | Laura Ruis, Brenden Lake | | 30 | 16 | 0 | 22 Feb 2022 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou | LM&Ro, LRM, AI4CE, ReLM | 315 | 8,261 | 0 | 28 Jan 2022 |
| Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers | Yi Tay, Mostafa Dehghani, J. Rao, W. Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler | | 183 | 89 | 0 | 22 Sep 2021 |
| Simpler PAC-Bayesian Bounds for Hostile Data | Pierre Alquier, Benjamin Guedj | | 71 | 72 | 0 | 23 Oct 2016 |