Monotonic Representation of Numeric Properties in Language Models

15 March 2024

Benjamin Heinzerling

Papers citing "Monotonic Representation of Numeric Properties in Language Models"

Title | Authors | Tags | Counts | Date
All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling | Emanuele Marconato, Sébastien Lachapelle, Sebastian Weichwald, Luigi Gresele | - | 50 3 0 | 30 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure | Yuxiao Li, Eric J. Michaud, David D. Baek, Joshua Engels, Xiaoqing Sun, Max Tegmark | - | 35 7 0 | 10 Oct 2024
Feature contamination: Neural networks learn uncorrelated features and fail to generalize | Tianren Zhang, Chujie Zhao, Guanyu Chen, Yizhou Jiang, Feng Chen | OOD MLT OODD | 52 2 0 | 05 Jun 2024
On the Origins of Linear Representations in Large Language Models | Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam, Victor Veitch | - | 51 24 0 | 06 Mar 2024
AtP*: An efficient and scalable method for localizing LLM behaviour to components | János Kramár, Tom Lieberum, Rohin Shah, Neel Nanda | KELM | 38 42 0 | 01 Mar 2024
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets | Samuel Marks, Max Tegmark | HILM | 91 164 0 | 10 Oct 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models | Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson | KELM | 186 260 0 | 28 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small | Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt | - | 207 486 0 | 01 Nov 2022
Fast Model Editing at Scale | E. Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning | KELM | 217 254 0 | 21 Oct 2021
Do Language Models Know the Way to Rome? | Bastien Liétard, Mostafa Abdou, Anders Søgaard | - | 38 15 0 | 16 Sep 2021
Language Models as Knowledge Bases? | Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel | KELM AI4MH | 391 2,216 0 | 03 Sep 2019