2202.05262
Cited By
Locating and Editing Factual Associations in GPT
Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
Papers citing "Locating and Editing Factual Associations in GPT"
50 / 1,361 papers shown
Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data
Charles Jin
Martin Rinard
219
3
0
18 Jul 2024
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
Jaden Fiotto-Kaufman
Alexander R. Loftus
Eric Todd
Jannik Brinkmann
Caden Juang
...
Carla Brodley
Arjun Guha
Jonathan Bell
Byron C. Wallace
David Bau
381
6
0
18 Jul 2024
Retrieval-Augmented Generation for Natural Language Processing: A Survey
Shangyu Wu
Ying Xiong
Yufei Cui
Haolun Wu
Can Chen
...
Lianming Huang
Xue Liu
Tei-Wei Kuo
Nan Guan
Chun Jason Xue
3DV
RALM
465
97
0
18 Jul 2024
Establishing Knowledge Preference in Language Models
Sizhe Zhou
Sha Li
Yu Meng
Yizhu Jiao
Heng Ji
Jiawei Han
KELM
236
0
0
17 Jul 2024
LLM Circuit Analyses Are Consistent Across Training and Scale
Curt Tigges
Michael Hanna
Qinan Yu
Stella Biderman
332
32
0
15 Jul 2024
How and where does CLIP process negation?
Vincent Quantmeyer
Pablo Mosteiro
Albert Gatt
CoGe
246
11
0
15 Jul 2024
Cross-Lingual Multi-Hop Knowledge Editing
Aditi Khandelwal
Harman Singh
Hengrui Gu
Tianlong Chen
Kaixiong Zhou
KELM
148
0
0
14 Jul 2024
On Large Language Model Continual Unlearning
Chongyang Gao
Lixu Wang
Chenkai Weng
Tianlin Li
Qi Zhu
MU
268
0
0
14 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
Haoze Song
SyDa
288
23
0
12 Jul 2024
Transformer Circuit Faithfulness Metrics are not Robust
Joseph Miller
Bilal Chughtai
William Saunders
207
9
0
11 Jul 2024
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
Huanqian Wang
Yang Yue
Rui Lu
Jingxin Shi
Andrew Zhao
Shenzhi Wang
Shiji Song
Gao Huang
LM&Ro
KELM
428
16
0
11 Jul 2024
Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models
Yuji Zhang
Sha Li
Jiateng Liu
Pengfei Yu
Yi R. Fung
Jing Li
Heng Ji
381
27
0
10 Jul 2024
Uncovering Layer-Dependent Activation Sparsity Patterns in ReLU Transformers
Cody Wild
Jesper Anderson
MoE
190
0
0
10 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
219
33
0
10 Jul 2024
Composable Interventions for Language Models
Arinbjorn Kolbeinsson
Kyle O'Brien
Tianjin Huang
Shanghua Gao
Shiwei Liu
...
Anurag J. Vaidya
Faisal Mahmood
Marinka Zitnik
Tianlong Chen
Thomas Hartvigsen
KELM
MU
519
4
0
09 Jul 2024
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi
Jaechan Lee
Yangsibo Huang
Sadhika Malladi
Jieyu Zhao
Ari Holtzman
Daogao Liu
Luke Zettlemoyer
Noah A. Smith
Chiyuan Zhang
MU
ELM
310
149
0
08 Jul 2024
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
Zeyu Leo Liu
Shrey Pandit
Xi Ye
Eunsol Choi
Greg Durrett
KELM
ALM
405
13
0
08 Jul 2024
Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks
Aaron Mueller
CML
228
16
0
05 Jul 2024
Concept Bottleneck Models Without Predefined Concepts
Simon Schrodi
Julian Schur
Max Argus
Thomas Brox
239
16
0
04 Jul 2024
Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity
Lei Yu
Jingcheng Niu
Zining Zhu
Xi Chen
Gerald Penn
214
9
0
04 Jul 2024
Truth is Universal: Robust Detection of Lies in LLMs
Lennart Bürger
Fred Hamprecht
B. Nadler
HILM
237
51
0
03 Jul 2024
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Bozhong Tian
Xiaozhuan Liang
Siyuan Cheng
Qingbin Liu
Mengru Wang
Dianbo Sui
Xi Chen
Huajun Chen
Xin Xu
MU
225
21
0
02 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
640
81
0
02 Jul 2024
Why Does New Knowledge Create Messy Ripple Effects in LLMs?
Jiaxin Qin
Zixuan Zhang
Chi Han
Pengfei Yu
KELM
246
19
0
02 Jul 2024
PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models
Kunquan Deng
Zeyu Huang
Chen Li
Chenghua Lin
Min Gao
Wenge Rong
KELM
186
2
0
29 Jun 2024
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs
Sheridan Feucht
David Atkinson
Byron C. Wallace
David Bau
243
13
0
28 Jun 2024
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models
Renzhi Wang
Piji Li
KELM
CLL
285
13
0
28 Jun 2024
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation
Zijun Yao
Weijian Qi
Liangming Pan
S. Cao
Linmei Hu
Weichuan Liu
Lei Hou
Juanzi Li
RALM
160
22
0
27 Jun 2024
AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models
Aishwarya Agarwal
Srikrishna Karanam
Balaji Vasan Srinivasan
253
2
0
27 Jun 2024
The Remarkable Robustness of LLMs: Stages of Inference?
Vedang Lad
Wes Gurnee
Max Tegmark
519
87
0
27 Jun 2024
Evaluating Copyright Takedown Methods for Language Models
Boyi Wei
Weijia Shi
Yangsibo Huang
Noah A. Smith
Chiyuan Zhang
Luke Zettlemoyer
Kai Li
Peter Henderson
458
38
0
26 Jun 2024
IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons
Dan Shi
Renren Jin
Shangda Wu
Weilong Dong
Xinwei Wu
Deyi Xiong
249
26
0
26 Jun 2024
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang
Goutham Rajendran
Pradeep Ravikumar
Bryon Aragam
CLL
KELM
277
12
0
26 Jun 2024
Enhancing Data Privacy in Large Language Models through Private Association Editing
Davide Venditti
Elena Sofia Ruzzetti
Giancarlo A. Xompero
Cristina Giannone
Andrea Favalli
Raniero Romagnoli
Fabio Massimo Zanzotto
KELM
209
7
0
26 Jun 2024
Transformer Normalisation Layers and the Independence of Semantic Subspaces
S. Menary
Samuel Kaski
Andre Freitas
231
2
0
25 Jun 2024
BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning
Ercong Nie
Bo Shao
Zifeng Ding
Mingyang Wang
Helmut Schmid
Hinrich Schütze
KELM
494
12
0
25 Jun 2024
How Well Can Knowledge Edit Methods Edit Perplexing Knowledge?
Huaizhi Ge
Frank Rudzicz
Zining Zhu
KELM
289
4
0
25 Jun 2024
It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension
Sagi Shaier
Lawrence E Hunter
Katharina von der Wense
269
4
0
24 Jun 2024
Multilingual Knowledge Editing with Language-Agnostic Factual Neurons
Xue Zhang
Yunlong Liang
Fandong Meng
Songming Zhang
Yufeng Chen
Jinan Xu
Jie Zhou
KELM
166
14
0
24 Jun 2024
MD tree: a model-diagnostic tree grown on loss landscape
Yefan Zhou
Jianlong Chen
Qinxue Cao
Konstantin Schürholt
Yaoqing Yang
296
2
0
24 Jun 2024
Confidence Regulation Neurons in Language Models
Alessandro Stolfo
Ben Wu
Wes Gurnee
Yonatan Belinkov
Xingyi Song
Mrinmaya Sachan
Neel Nanda
242
39
0
24 Jun 2024
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation
Michal Golovanevsky
William Rudman
Vedant Palit
Ritambhara Singh
Carsten Eickhoff
446
10
0
24 Jun 2024
Large Language Models Are Cross-Lingual Knowledge-Free Reasoners
Peng Hu
Sizhe Liu
Changjiang Gao
Xue Han
Junlan Feng
Chao Deng
Shujian Huang
LRM
405
14
0
24 Jun 2024
FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models
Junyi Zhu
Shuochen Liu
Yu Yu
Bo Tang
Yibo Yan
Zhiyu Li
Feiyu Xiong
Tong Xu
Matthew B. Blaschko
210
6
0
23 Jun 2024
Memorizing Documents with Guidance in Large Language Models
Bumjin Park
Jaesik Choi
KELM
RALM
195
1
0
23 Jun 2024
Unveiling LLM Mechanisms Through Neural ODEs and Control Theory
Yukun Zhang
Qi Dong
309
0
0
23 Jun 2024
Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects
Michael A. Lepori
Alexa R. Tartaglini
Wai Keen Vong
Thomas Serre
Brenden M. Lake
Ellie Pavlick
223
15
0
22 Jun 2024
Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models
Christopher Burger
Yifan Hu
Thai Le
KELM
185
0
0
22 Jun 2024
Steering Without Side Effects: Improving Post-Deployment Control of Language Models
Asa Cooper Stickland
Alexander Lyzhov
Jacob Pfau
Salsabila Mahdi
Samuel R. Bowman
LLMSV
AAML
244
38
0
21 Jun 2024
Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
Jianhui Chen
Xiaozhi Wang
Zijun Yao
Yushi Bai
Lei Hou
Juanzi Li
LLMSV
KELM
344
26
0
20 Jun 2024