2202.05262
Locating and Editing Factual Associations in GPT
Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
Papers citing "Locating and Editing Factual Associations in GPT"
50 / 1,361 papers shown
MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models
Kaichen Huang
Jiahao Huo
Yibo Yan
Kun Wang
Yutao Yue
Xuming Hu
244
2
0
07 Oct 2024
OD-Stega: LLM-Based Near-Imperceptible Steganography via Optimized Distributions
Yu-Shin Huang
Peter Just
Krishna Narayanan
Chao Tian
274
15
0
06 Oct 2024
Evaluating Language Model Character Traits
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Francis Rhys Ward
Zejia Yang
Alex Jackson
Randy Brown
Chandler Smith
Grace Colverd
Louis Thomson
Raymond Douglas
Patrik Bartak
Andrew Rowan
138
0
0
05 Oct 2024
Neuron-Level Sequential Editing for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Houcheng Jiang
Cunchun Li
Tianyu Zhang
An Zhang
Ruipeng Wang
Tao Liang
Xiang Wang
KELM
233
11
0
05 Oct 2024
Understanding Reasoning in Chain-of-Thought from the Hopfieldian View
Lijie Hu
Liang Liu
Shu Yang
Xin Chen
Zhen Tan
Muhammad Asif Ali
Mengdi Li
Di Wang
LRM
301
9
0
04 Oct 2024
How Language Models Prioritize Contextual Grammatical Cues?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Hamidreza Amirzadeh
Afra Alishahi
Hosein Mohebbi
181
0
0
04 Oct 2024
RIPPLECOT: Amplifying Ripple Effect of Knowledge Editing in Language Models via Chain-of-Thought In-Context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zihao Zhao
Yuchen Yang
Yijiang Li
Yinzhi Cao
LRM
KELM
173
6
0
04 Oct 2024
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
Aurick Qiao
Z. Yao
Samyam Rajbhandari
Yuxiong He
VLM
343
6
0
04 Oct 2024
Fine-Tuning Language Models with Differential Privacy through Adaptive Noise Allocation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xianzhi Li
Ran Zmigrod
Zhiqiang Ma
Xiaomo Liu
Xiaodan Zhu
197
12
0
03 Oct 2024
HiddenGuard: Fine-Grained Safe Generation with Specialized Representation Router
Lingrui Mei
Shenghua Liu
Yiwei Wang
Baolong Bi
Ruibin Yuan
Xueqi Cheng
254
8
0
03 Oct 2024
Defining Knowledge: Bridging Epistemology and Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Constanza Fierro
Ruchira Dhar
Filippos Stamatiou
Nicolas Garneau
Anders Søgaard
KELM
335
10
0
03 Oct 2024
Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language
Anthony Costarelli
Mat Allen
Severin Field
274
5
0
03 Oct 2024
Better Call SAUL: Fluent and Consistent Language Model Editing with Generation Regularization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Mingyang Wang
Lukas Lange
Heike Adel
Jannik Strötgen
Hinrich Schütze
KELM
203
3
0
03 Oct 2024
Mitigating Memorization In Language Models
Mansi Sakarvadia
Aswathy Ajith
Arham Khan
Nathaniel Hudson
Caleb Geniesse
Kyle Chard
Yaoqing Yang
Ian Foster
Michael W. Mahoney
KELM
MU
331
8
0
03 Oct 2024
FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs
Deema Alnuhait
Neeraja Kirtane
Muhammad Khalifa
Hao Peng
LRM
HILM
370
7
0
03 Oct 2024
Erasing Conceptual Knowledge from Language Models
Rohit Gandikota
Sheridan Feucht
Samuel Marks
David Bau
ELM
KELM
MU
443
20
0
03 Oct 2024
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
International Conference on Learning Representations (ICLR), 2024
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
412
62
0
03 Oct 2024
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
International Conference on Learning Representations (ICLR), 2024
Hadas Orgad
Michael Toker
Zorik Gekhman
Roi Reichart
Idan Szpektor
Hadas Kotek
Yonatan Belinkov
HILM
AIFin
697
114
0
03 Oct 2024
Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Xinhao Yao
Hongjin Qian
Xiaolin Hu
Gengze Xu
Wei Liu
Jian Luan
Bin Wang
Wenshu Fan
434
6
0
03 Oct 2024
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models
International Conference on Learning Representations (ICLR), 2024
Cunchun Li
Houcheng Jiang
Kun Wang
Yunshan Ma
Shi Jie
Xiangnan He
Tat-Seng Chua
KELM
524
135
0
03 Oct 2024
Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models
International Conference on Learning Representations (ICLR), 2024
Guobin Shen
Dongcheng Zhao
Yiting Dong
Xiang He
Yi Zeng
AAML
336
11
0
03 Oct 2024
Question-guided Knowledge Graph Re-scoring and Injection for Knowledge Graph Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yu Zhang
Kehai Chen
Xuefeng Bai
Zhao Kang
Quanjiang Guo
Min Zhang
301
16
0
02 Oct 2024
Mitigating Copy Bias in In-Context Learning through Neuron Pruning
Ameen Ali
Lior Wolf
Ivan Titov
195
6
0
02 Oct 2024
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Philipp Mondorf
Sondre Wold
Yun Xue
492
2
0
02 Oct 2024
Skill Path: Unveiling Language Skills from Circuit Graphs
Hang Chen
Jiaying Zhu
Xinyu Yang
Wenya Wang
160
0
0
02 Oct 2024
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
International Conference on Learning Representations (ICLR), 2024
Jiyeon Kim
Hyunji Lee
Hyowon Cho
Joel Jang
Hyeonbin Hwang
Seungpil Won
Youbin Ahn
Dohaeng Lee
Minjoon Seo
KELM
1.0K
13
0
02 Oct 2024
Do Music Generation Models Encode Music Theory?
International Society for Music Information Retrieval Conference (ISMIR), 2024
Megan Wei
Michael Freeman
Chris Donahue
Chen Sun
MGen
193
7
0
01 Oct 2024
Quantifying reliance on external information over parametric knowledge during Retrieval Augmented Generation (RAG) using mechanistic analysis
Reshmi Ghosh
Rahul Seetharaman
Hitesh Wadhwa
Somyaa Aggarwal
Samyadeep Basu
Soundararajan Srinivasan
Wenlong Zhao
Shreyas Chaudhari
Ehsan Aghazadeh
103
2
0
01 Oct 2024
UniAdapt: A Universal Adapter for Knowledge Calibration
Tai D. Nguyen
Long H. Pham
Jun Sun
KELM
172
1
0
01 Oct 2024
Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shota Takashiro
Takeshi Kojima
Andrew Gambardella
Qi Cao
Yusuke Iwasawa
Y. Matsuo
CLL
MU
KELM
119
4
0
01 Oct 2024
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
Neural Information Processing Systems (NeurIPS), 2024
Kaihang Pan
Zhaoyu Fan
Juncheng Li
Qifan Yu
Hao Fei
Siliang Tang
Richang Hong
Hanwang Zhang
Qianru Sun
KELM
344
17
0
30 Sep 2024
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
International Conference on Learning Representations (ICLR), 2024
Haiyan Zhao
Heng Zhao
Bo Shen
Ali Payani
Fan Yang
Mengnan Du
425
16
0
30 Sep 2024
Transforming Hidden States into Binary Semantic Features
Tomáš Musil
David Mareček
OffRL
134
0
0
29 Sep 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Neural Information Processing Systems (NeurIPS), 2024
Zhehao Huang
Xinwen Cheng
Jinghao Zheng
Haoran Wang
Zhengbao He
Tao Li
Xiaolin Huang
MU
247
25
0
29 Sep 2024
Identifying Knowledge Editing Types in Large Language Models
Xiaopeng Li
Shasha Li
Shezheng Song
Bin Ji
Shan Zhao
Jun Ma
Jie Yu
KELM
335
2
0
29 Sep 2024
Crafting Personalized Agents through Retrieval-Augmented Generation on Editable Memory Graphs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zheng Wang
Zhongyang Li
Zeren Jiang
Dandan Tu
Wei Shi
218
14
0
28 Sep 2024
Localizing Memorization in SSL Vision Encoders
Neural Information Processing Systems (NeurIPS), 2024
Wenhao Wang
Adam Dziedzic
Michael Backes
Franziska Boenisch
259
6
0
27 Sep 2024
"Why" Has the Least Side Effect on Model Editing
Tsung-Hsuan Pan
Chung-Chi Chen
Hen-Hsen Huang
Hsin-Hsi Chen
KELM
128
1
0
27 Sep 2024
SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning
IEEE Transactions on Dependable and Secure Computing (IEEE TDSC), 2024
Minyeong Choe
Cheolhee Park
Changho Seo
Hyunil Kim
AAML
FedML
SILM
275
4
0
23 Sep 2024
Investigating Layer Importance in Large Language Models
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Yang Zhang
Yanfei Dong
Kenji Kawaguchi
FAtt
244
26
0
22 Sep 2024
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
David Chanin
James Wilken-Smith
Tomáš Dulka
Hardik Bhatnagar
Joseph Bloom
541
74
0
22 Sep 2024
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zeping Yu
Sophia Ananiadou
LRM
MILM
283
26
0
21 Sep 2024
Uncovering Latent Chain of Thought Vectors in Language Models
Jason Zhang
Scott Viteri
LLMSV
LRM
458
8
0
21 Sep 2024
Co-occurrence is not Factual Association in Language Models
Neural Information Processing Systems (NeurIPS), 2024
Xiao Zhang
Chenyi Guo
Ji Wu
KELM
413
8
0
21 Sep 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
995
8
0
20 Sep 2024
LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models
Akshaj Kumar Veldanda
Shi-Xiong Zhang
Anirban Das
Supriyo Chakraborty
Stephen Rawls
Sambit Sahu
Milind Naphade
KELM
MU
260
4
0
19 Sep 2024
Pay Attention to What Matters
Pedro Luiz Silva
Antonio De Domenico
Ali Maatouk
Fadhel Ayed
ALM
154
1
0
19 Sep 2024
MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language
International Conference on Computational Linguistics (COLING), 2024
Muhammad Asif Ali
Nawal Daftardar
Mutayyaba Waheed
Jianbin Qin
Di Wang
KELM
273
9
0
18 Sep 2024
StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models
Baolong Bi
Shenghua Liu
Yiwei Wang
Lingrui Mei
Hongcheng Gao
Junfeng Fang
Xueqi Cheng
KELM
215
10
0
16 Sep 2024
Householder Pseudo-Rotation: A Novel Approach to Activation Editing in LLMs with Direction-Magnitude Perspective
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Van-Cuong Pham
Thien Huu Nguyen
LLMSV
222
11
0
16 Sep 2024