On the Origins of Linear Representations in Large Language Models

6 March 2024

Papers citing "On the Origins of Linear Representations in Large Language Models"

8 / 8 papers shown

Title
Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models Ruta Binkyte Ivaxi Sheth Zhijing Jin Mohammad Havaei Bernhard Schölkopf Mario Fritz 49 0 0 28 Feb 2025
Lines of Thought in Large Language Models Raphael Sarfati Toni J. B. Liu Nicolas Boullé Christopher Earls LRM VLM LM&Ro 55 1 0 17 Feb 2025
ResiDual Transformer Alignment with Spectral Decomposition Lorenzo Basile Valentino Maiorca Luca Bortolussi Emanuele Rodolà Francesco Locatello 43 1 0 31 Oct 2024
All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling Emanuele Marconato Sébastien Lachapelle Sebastian Weichwald Luigi Gresele 55 3 0 30 Oct 2024
Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models Goutham Rajendran Simon Buchholz Bryon Aragam Bernhard Schölkopf Pradeep Ravikumar AI4CE 78 19 0 14 Feb 2024
Finding Neurons in a Haystack: Case Studies with Sparse Probing Wes Gurnee Neel Nanda Matthew Pauly Katherine Harvey Dmitrii Troitskii Dimitris Bertsimas MILM 153 170 0 02 May 2023
Toy Models of Superposition Nelson Elhage Tristan Hume Catherine Olsson Nicholas Schiefer T. Henighan ... Sam McCandlish Jared Kaplan Dario Amodei Martin Wattenberg C. Olah AAML MILM 117 314 0 21 Sep 2022
Contrastive Learning Inverts the Data Generating Process Roland S. Zimmermann Yash Sharma Steffen Schneider Matthias Bethge Wieland Brendel SSL 230 206 0 17 Feb 2021