2312.10091
Cited By
Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models
13 December 2023
Alexandre Variengien
Eric Winsor
LRM
ReLM
Papers citing
"Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models"
6 / 6 papers shown
Title
Activation Steering in Neural Theorem Provers
Shashank Kirtania
LLMSV
59
0
0
21 Feb 2025
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
173
116
0
30 Apr 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
180
152
0
28 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
197
2,232
0
22 Mar 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
205
486
0
01 Nov 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022