Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities

16 February 2024

Ting-Rui Chiang

Dani Yogatama

Main: 8 Pages

8 Figures

Bibliography: 4 Pages

6 Tables

Appendix: 10 Pages

Abstract

In this work, we propose Pelican Soup, a simple theoretical framework that aims to explain how pretraining allows LLMs to (1) generalize to unseen instructions and (2) perform in-context learning, even when the verbalizers are irrelevant to the task. To this end, the framework introduces the notions of a "knowledge base" and "reference-sense association", together with a simple formalism for natural language processing tasks. It demonstrates how studies in linguistics, psychology, and philosophy can inform our understanding of language models, and it connects to several existing theoretical results. To illustrate its use, we derive a bound on in-context learning loss. Finally, we support the framework with empirical experiments and suggest possible directions for future research.
