Data Distributional Properties Drive Emergent In-Context Learning in Transformers

22 April 2022

Pierre Harvey Richemond

J. McClelland

Felix Hill

Papers citing "Data Distributional Properties Drive Emergent In-Context Learning in Transformers"

50 / 174 papers shown

Learned feature representations are biased by complexity, learning order, position, and more Andrew Kyle Lampinen Stephanie C. Y. Chan Katherine Hermann AI4CE FaML SSL OOD 19 5 0 09 May 2024
In-Context Learning State Vector with Inner and Momentum Optimization Dongfang Li Zhenyu Liu Xinshuo Hu Zetian Sun Baotian Hu Min Zhang 21 5 0 17 Apr 2024
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance Vishaal Udandarao Ameya Prabhu Adhiraj Ghosh Yash Sharma Philip H. S. Torr Adel Bibi Samuel Albanie Matthias Bethge VLM 112 43 0 04 Apr 2024
The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition Georgios Chochlakis Alexandros Potamianos Kristina Lerman Shrikanth Narayanan 12 5 0 25 Mar 2024
Computational Models to Study Language Processing in the Human Brain: A Survey Shaonan Wang Jingyuan Sun Yunhao Zhang Nan Lin Marie-Francine Moens Chengqing Zong 14 5 0 20 Mar 2024
Towards Understanding the Relationship between In-context Learning and Compositional Generalization Sungjun Han Sebastian Padó CoGe 16 2 0 18 Mar 2024
StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation Jinpeng Li Zekai Zhang Quan Tu Xin Cheng Dongyan Zhao Rui Yan 34 2 0 18 Mar 2024
Concept-aware Data Construction Improves In-context Learning of Language Models Michal Štefánik Marek Kadlčík Petr Sojka 38 0 0 08 Mar 2024
Not All Layers of LLMs Are Necessary During Inference Siqi Fan Xin Jiang Xiang Li Xuying Meng Peng Han Shuo Shang Aixin Sun Yequan Wang Zhongyuan Wang 38 32 0 04 Mar 2024
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models Haoran Liao Jidong Tian Shaohua Hu Hao He Yaohui Jin ReLM LRM 28 1 0 24 Feb 2024
Balanced Data Sampling for Language Model Training with Clustering Yunfan Shao Linyang Li Zhaoye Fei Hang Yan Dahua Lin Xipeng Qiu 24 8 0 22 Feb 2024
Linear Transformers are Versatile In-Context Learners Max Vladymyrov J. Oswald Mark Sandler Rong Ge 18 13 0 21 Feb 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training Yu Zhao Yuanbin Qu Konrad Staniszewski Szymon Tworkowski Wei Liu Piotr Miłoś Yuxiang Wu Pasquale Minervini 24 13 0 21 Feb 2024
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains Benjamin L. Edelman Ezra Edelman Surbhi Goel Eran Malach Nikolaos Tsilivis BDL 18 39 0 16 Feb 2024
Bridging Associative Memory and Probabilistic Modeling Rylan Schaeffer Nika Zahedi Mikail Khona Dhruv Pai Sang T. Truong ... Sarthak Chandra Andres Carranza Ila Rani Fiete Andrey Gromov Oluwasanmi Koyejo DiffM 43 2 0 15 Feb 2024
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks Jongho Park Jaeseung Park Zheyang Xiong Nayoung Lee Jaewoong Cho Samet Oymak Kangwook Lee Dimitris Papailiopoulos 16 31 0 06 Feb 2024
In-context learning agents are asymmetric belief updaters Johannes A. Schubert A. Jagadish Marcel Binz Eric Schulz LLMAG 8 4 0 06 Feb 2024
Is Mamba Capable of In-Context Learning? Riccardo Grazzi Julien N. Siems Simon Schrodi Thomas Brox Frank Hutter 18 20 0 05 Feb 2024
Human-like Category Learning by Injecting Ecological Priors from Large Language Models into Neural Networks A. Jagadish Julian Coda-Forno Mirko Thalmann Eric Schulz Marcel Binz 16 3 0 02 Feb 2024
In-Context Language Learning: Architectures and Algorithms Ekin Akyürek Bailin Wang Yoon Kim Jacob Andreas LRM ReLM 32 16 0 23 Jan 2024
Enhancing In-context Learning via Linear Probe Calibration Momin Abbas Yi Zhou Parikshit Ram Nathalie Baracaldo Horst Samulowitz Theodoros Salonidis Tianyi Chen 53 9 0 22 Jan 2024
An Empirical Study of In-context Learning in LLMs for Machine Translation Pranjal A. Chitale Jay Gala Raj Dabre LRM 13 5 0 22 Jan 2024
Anchor function: a type of benchmark functions for studying language models Zhongwang Zhang Zhiwei Wang Junjie Yao Zhangchen Zhou Xiaolong Li E. Weinan Z. Xu 29 5 0 16 Jan 2024
Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning Shuai Zhao Meihuizi Jia Anh Tuan Luu Fengjun Pan Jinming Wen AAML 29 34 0 11 Jan 2024
Grimoire is All You Need for Enhancing Large Language Models Ding Chen Shichao Song Qingchen Yu Zhiyu Li Wenjin Wang Feiyu Xiong Bo Tang 25 4 0 07 Jan 2024
Memory, Consciousness and Large Language Model Jitang Li Jinzheng Li 8 5 0 04 Jan 2024
Structured Packing in LLM Training Improves Long Context Utilization Konrad Staniszewski Szymon Tworkowski Sebastian Jaszczur Yu Zhao Henryk Michalewski Łukasz Kuciński Piotr Miłoś 28 13 0 28 Dec 2023
Improving In-context Learning via Bidirectional Alignment Chengwei Qin Wenhan Xia Fangkai Jiao Chen Chen Yuchen Hu Bosheng Ding Shafiq R. Joty 35 7 0 28 Dec 2023
TinyGSM: achieving >80% on GSM8k with small language models Bingbin Liu Sébastien Bubeck Ronen Eldan Janardhan Kulkarni Yuanzhi Li Anh Nguyen Rachel A. Ward Yi Zhang ALM 19 47 0 14 Dec 2023
Large Language Model Enhanced Multi-Agent Systems for 6G Communications Feibo Jiang Li Dong Yubo Peng Kezhi Wang Kun Yang Cunhua Pan Dusit Niyato O. Dobre LLMAG 19 35 0 13 Dec 2023
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks Jiarui Xu Yossi Gandelsman Amir Bar Jianwei Yang Jianfeng Gao Trevor Darrell Xiaolong Wang VLM 16 3 0 04 Dec 2023
The Contemporary Art of Image Search: Iterative User Intent Expansion via Vision-Language Model Yilin Ye Qian Zhu Shishi Xiao Kang Zhang Wei Zeng 28 3 0 04 Dec 2023
The mechanistic basis of data dependence and abrupt learning in an in-context classification task Gautam Reddy 17 48 0 03 Dec 2023
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks Rahul Ramesh Ekdeep Singh Lubana Mikail Khona Robert P. Dick Hidenori Tanaka CoGe 22 6 0 21 Nov 2023
Exploring the Relationship between In-Context Learning and Instruction Tuning Hanyu Duan Yixuan Tang Yi Yang Ahmed Abbasi K. Tam 21 4 0 17 Nov 2023
The Transient Nature of Emergent In-Context Learning in Transformers Aaditya K. Singh Stephanie C. Y. Chan Ted Moskovitz Erin Grant Andrew M. Saxe Felix Hill 62 31 0 14 Nov 2023
On Using Distribution-Based Compositionality Assessment to Evaluate Compositional Generalisation in Machine Translation Anssi Moisio Mathias Creutz M. Kurimo CoGe 10 1 0 14 Nov 2023
Data Factors for Better Compositional Generalization Xiang Zhou Yichen Jiang Mohit Bansal CoGe OOD 11 1 0 08 Nov 2023
Exploring Dataset-Scale Indicators of Data Quality Ben Feuer Chinmay Hegde 11 1 0 07 Nov 2023
The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis Yuxiang Zhou Jiazheng Li Yanzheng Xiang Hanqi Yan Lin Gui Yulan He 22 13 0 01 Nov 2023
Transformers are Provably Optimal In-context Estimators for Wireless Communications Vishnu Teja Kunde Vicram Rajagopalan Chandra Shekhara Kaushik Valmeekam Krishna R. Narayanan S. Shakkottai D. Kalathil J. Chamberland 29 4 0 01 Nov 2023
Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection Costas Mavromatis Balasubramaniam Srinivasan Zhengyuan Shen Jiani Zhang Huzefa Rangwala Christos Faloutsos George Karypis 19 21 0 30 Oct 2023
Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models Pushkal Katara Zhou Xian Katerina Fragkiadaki LM&Ro 35 34 0 27 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time Zichang Liu Jue Wang Tri Dao Tianyi Zhou Binhang Yuan ... Anshumali Shrivastava Ce Zhang Yuandong Tian Christopher Ré Beidi Chen BDL 11 189 0 26 Oct 2023
Exploring Question Decomposition for Zero-Shot VQA Zaid Khan B. Vijaykumar S. Schulter Manmohan Chandraker Yun Fu ReLM 17 9 0 25 Oct 2023
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks Aradhana Sinha Ananth Balashankar Ahmad Beirami Thi Avrahami Jilin Chen Alex Beutel AAML 17 4 0 25 Oct 2023
MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications Yizhe Yang Huashan Sun Jiawei Li Runheng Liu Yinghao Li Yuhang Liu Heyan Huang Yang Gao ALM LRM 8 8 0 24 Oct 2023
Are LSTMs Good Few-Shot Learners? Mike Huisman Thomas M. Moerland Aske Plaat Jan N. van Rijn VLM 8 7 0 22 Oct 2023
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems David T. Hoffmann Simon Schrodi Jelena Bratulić Nadine Behrmann Volker Fischer Thomas Brox 19 3 0 19 Oct 2023
IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models Shaokun Zhang Xiaobo Xia Zhaoqing Wang Ling-Hao Chen Jiale Liu Qingyun Wu Tongliang Liu 26 20 0 16 Oct 2023