ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2301.04213
  4. Cited By
Does Localization Inform Editing? Surprising Differences in
  Causality-Based Localization vs. Knowledge Editing in Language Models

Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models

10 January 2023
Peter Hase
Mohit Bansal
Been Kim
Asma Ghandeharioun
    MILM
ArXivPDFHTML

Papers citing "Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models"

43 / 143 papers shown
Title
A Comprehensive Study of Knowledge Editing for Large Language Models
A Comprehensive Study of Knowledge Editing for Large Language Models
Ningyu Zhang
Yunzhi Yao
Bo Tian
Peng Wang
Shumin Deng
...
Lei Liang
Zhiqiang Zhang
Xiao-Jun Zhu
Jun Zhou
Huajun Chen
KELM
26
76
0
02 Jan 2024
PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
Hengrui Gu
Kaixiong Zhou
Xiaotian Han
Ninghao Liu
Ruobing Wang
Xin Wang
LRM
KELM
61
22
0
23 Dec 2023
The Truth is in There: Improving Reasoning in Language Models with
  Layer-Selective Rank Reduction
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Pratyusha Sharma
Jordan T. Ash
Dipendra Kumar Misra
LRM
11
77
0
21 Dec 2023
Neuron-Level Knowledge Attribution in Large Language Models
Neuron-Level Knowledge Attribution in Large Language Models
Zeping Yu
Sophia Ananiadou
FAtt
KELM
16
6
0
19 Dec 2023
Grokking Group Multiplication with Cosets
Grokking Group Multiplication with Cosets
Dashiell Stander
Qinan Yu
Honglu Fan
Stella Biderman
33
9
0
11 Dec 2023
Interpretability Illusions in the Generalization of Simplified Models
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman
Andrew Kyle Lampinen
Lucas Dixon
Danqi Chen
Asma Ghandeharioun
17
14
0
06 Dec 2023
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale
  of Two Benchmarks
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
Ting-Yun Chang
Jesse Thomason
Robin Jia
15
14
0
15 Nov 2023
Trends in Integration of Knowledge and Large Language Models: A Survey
  and Taxonomy of Methods, Benchmarks, and Applications
Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications
Zhangyin Feng
Weitao Ma
Weijiang Yu
Lei Huang
Haotian Wang
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
KELM
21
37
0
10 Nov 2023
Massive Editing for Large Language Models via Meta Learning
Massive Editing for Large Language Models via Meta Learning
Chenmien Tan
Ge Zhang
Jie Fu
KELM
9
29
0
08 Nov 2023
The Expressibility of Polynomial based Attention Scheme
The Expressibility of Polynomial based Attention Scheme
Zhao-quan Song
Guangyi Xu
Junze Yin
27
5
0
30 Oct 2023
A Survey on Knowledge Editing of Neural Networks
A Survey on Knowledge Editing of Neural Networks
Vittorio Mazzia
Alessandro Pedrani
Andrea Caciolai
Kay Rottmann
Davide Bernardi
KELM
12
24
0
30 Oct 2023
Identifying and Adapting Transformer-Components Responsible for Gender
  Bias in an English Language Model
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model
Abhijith Chintam
Rahel Beloch
Willem H. Zuidema
Michael Hanna
Oskar van der Wal
18
16
0
19 Oct 2023
Emptying the Ocean with a Spoon: Should We Edit Models?
Emptying the Ocean with a Spoon: Should We Edit Models?
Yuval Pinter
Michael Elhadad
KELM
20
26
0
18 Oct 2023
Can We Edit Multimodal Large Language Models?
Can We Edit Multimodal Large Language Models?
Siyuan Cheng
Bo Tian
Qingbin Liu
Xi Chen
Yongheng Wang
Huajun Chen
Ningyu Zhang
MLLM
28
28
0
12 Oct 2023
How Do Large Language Models Capture the Ever-changing World Knowledge?
  A Review of Recent Advances
How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances
Zihan Zhang
Meng Fang
Lingxi Chen
Mohammad-Reza Namazi-Rad
Jun Wang
KELM
17
21
0
11 Oct 2023
An Adversarial Example for Direct Logit Attribution: Memory Management
  in gelu-4l
An Adversarial Example for Direct Logit Attribution: Memory Management in gelu-4l
James Dao
Yeu-Tong Lau
Can Rager
Jett Janiak
35
5
0
11 Oct 2023
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Deniz Bayazit
Negar Foroutan
Zeming Chen
Gail Weiss
Antoine Bosselut
KELM
13
13
0
04 Oct 2023
Editing Personality for Large Language Models
Editing Personality for Large Language Models
Shengyu Mao
Xiaohan Wang
Meng Wang
Yong-jia Jiang
Pengjun Xie
Fei Huang
Ningyu Zhang
KELM
33
9
0
03 Oct 2023
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending
  Against Extraction Attacks
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
Vaidehi Patil
Peter Hase
Mohit Bansal
KELM
AAML
18
94
0
29 Sep 2023
Towards Best Practices of Activation Patching in Language Models:
  Metrics and Methods
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang
Neel Nanda
LLMSV
26
96
0
27 Sep 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM
  Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao-quan Song
Weixin Wang
Junze Yin
18
25
0
14 Sep 2023
Memory Injections: Correcting Multi-Hop Reasoning Failures during
  Inference in Transformer-Based Language Models
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
Mansi Sakarvadia
Aswathy Ajith
Arham Khan
Daniel Grzenda
Nathaniel Hudson
André Bauer
Kyle Chard
Ian T. Foster
KELM
LRM
17
16
0
11 Sep 2023
Emergent Linear Representations in World Models of Self-Supervised
  Sequence Models
Emergent Linear Representations in World Models of Self-Supervised Sequence Models
Neel Nanda
Andrew Lee
Martin Wattenberg
FAtt
MILM
35
141
0
02 Sep 2023
Journey to the Center of the Knowledge Neurons: Discoveries of
  Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
Yuheng Chen
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
KELM
25
41
0
25 Aug 2023
GradientCoin: A Peer-to-Peer Decentralized Large Language Models
GradientCoin: A Peer-to-Peer Decentralized Large Language Models
Yeqi Gao
Zhao-quan Song
Junze Yin
21
18
0
21 Aug 2023
Linearity of Relation Decoding in Transformer Language Models
Linearity of Relation Decoding in Transformer Language Models
Evan Hernandez
Arnab Sen Sharma
Tal Haklay
Kevin Meng
Martin Wattenberg
Jacob Andreas
Yonatan Belinkov
David Bau
KELM
11
82
0
17 Aug 2023
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language
  Models
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Peng Wang
Ningyu Zhang
Bo Tian
Zekun Xi
Yunzhi Yao
...
Shuyang Cheng
Kangwei Liu
Yuansheng Ni
Guozhou Zheng
Huajun Chen
KELM
27
41
0
14 Aug 2023
FeedbackLogs: Recording and Incorporating Stakeholder Feedback into
  Machine Learning Pipelines
FeedbackLogs: Recording and Incorporating Stakeholder Feedback into Machine Learning Pipelines
Matthew Barker
Emma Kallina
D. Ashok
Katherine M. Collins
Ashley Casovan
Adrian Weller
Ameet Talwalkar
Valerie Chen
Umang Bhatt
33
5
0
28 Jul 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from
  Human Feedback
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
34
468
0
27 Jul 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language
  Model
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM
HILM
15
470
0
06 Jun 2023
Editing Common Sense in Transformers
Editing Common Sense in Transformers
Anshita Gupta
Debanjan Mondal
Akshay Krishna Sheshadri
Wenlong Zhao
Xiang Lorraine Li
Sarah Wiegreffe
Niket Tandon
KELM
29
20
0
24 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao
Peng Wang
Bo Tian
Shuyang Cheng
Zhoubo Li
Shumin Deng
Huajun Chen
Ningyu Zhang
KELM
30
278
0
22 May 2023
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Wes Gurnee
Neel Nanda
Matthew Pauly
Katherine Harvey
Dmitrii Troitskii
Dimitris Bertsimas
MILM
153
186
0
02 May 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language
  Models
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
189
261
0
28 Apr 2023
Task-Specific Skill Localization in Fine-tuned Language Models
Task-Specific Skill Localization in Fine-tuned Language Models
A. Panigrahi
Nikunj Saunshi
Haoyu Zhao
Sanjeev Arora
MoMe
21
66
0
13 Feb 2023
Editing Language Model-based Knowledge Graph Embeddings
Editing Language Model-based Knowledge Graph Embeddings
Shuyang Cheng
Ningyu Zhang
Bo Tian
Feiyu Xiong
Wei Guo
Huajun Chen
KELM
47
23
0
25 Jan 2023
Disentangled Explanations of Neural Network Predictions by Finding
  Relevant Subspaces
Disentangled Explanations of Neural Network Predictions by Finding Relevant Subspaces
Pattarawat Chormai
J. Herrmann
Klaus-Robert Muller
G. Montavon
FAtt
43
17
0
30 Dec 2022
Aging with GRACE: Lifelong Model Editing with Discrete Key-Value
  Adaptors
Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
Thomas Hartvigsen
S. Sankaranarayanan
Hamid Palangi
Yoon Kim
Marzyeh Ghassemi
KELM
14
143
0
20 Nov 2022
Natural Language Descriptions of Deep Visual Features
Natural Language Descriptions of Deep Visual Features
Evan Hernandez
Sarah Schwettmann
David Bau
Teona Bagashvili
Antonio Torralba
Jacob Andreas
MILM
196
116
0
26 Jan 2022
Editing a classifier by rewriting its prediction rules
Editing a classifier by rewriting its prediction rules
Shibani Santurkar
Dimitris Tsipras
Mahalaxmi Elango
David Bau
Antonio Torralba
A. Madry
KELM
175
89
0
02 Dec 2021
Fast Model Editing at Scale
Fast Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Chelsea Finn
Christopher D. Manning
KELM
219
341
0
21 Oct 2021
Language Models as Knowledge Bases?
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktaschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
406
2,576
0
03 Sep 2019
Efficient Estimation of Word Representations in Vector Space
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013
Previous
123