
arXiv:2505.16252

Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models

22 May 2025
Hwiyeong Lee
Uiji Hwang
Hyelim Lim
Taeuk Kim
    MU

Papers citing "Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models"

18 papers shown

SoK: Machine Unlearning for Large Language Models
Jie Ren, Yue Xing, Yingqian Cui, Charu C. Aggarwal, Hui Liu
MU · 10 Jun 2025

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning
Ruiqi Zhang, Licong Lin, Yu Bai, Song Mei
MU · 08 Apr 2024

Digital Forgetting in Large Language Models: A Survey of Unlearning Methods. Artificial Intelligence Review (Artif Intell Rev), 2024.
Alberto Blanco-Justicia, N. Jebreel, Benet Manzanares-Salor, David Sánchez, Josep Domingo-Ferrer, Guillem Collell, Kuan Eeik Tan
KELM, MU · 02 Apr 2024

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, ..., Yan Shoshitaishvili, Jimmy Ba, K. Esvelt, Alexandr Wang, Dan Hendrycks
ELM · 05 Mar 2024

Eight Methods to Evaluate Robust Unlearning in LLMs
Aengus Lynch, Phillip Guo, Aidan Ewart, Stephen Casper, Dylan Hadfield-Menell
ELM, MU · 26 Feb 2024

TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary Chase Lipton, J. Zico Kolter
MU, CLL · 11 Jan 2024

Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks. North American Chapter of the Association for Computational Linguistics (NAACL), 2023.
Ting-Yun Chang, Jesse Thomason, Robin Jia
15 Nov 2023

Who's Harry Potter? Approximate Unlearning in LLMs
Ronen Eldan, M. Russinovich
MU, MoMe · 03 Oct 2023

Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks. International Conference on Learning Representations (ICLR), 2023.
Vaidehi Patil, Peter Hase, Joey Tianyi Zhou
KELM, AAML · 29 Sep 2023

Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin
24 Aug 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Neural Information Processing Systems (NeurIPS), 2023.
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
ALM · 29 May 2023

Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
RALM · 28 Apr 2023

Dissecting Recall of Factual Associations in Auto-Regressive Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson
KELM · 28 Apr 2023

Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. Neural Information Processing Systems (NeurIPS), 2023.
Peter Hase, Joey Tianyi Zhou, Been Kim, Asma Ghandeharioun
MILM · 10 Jan 2023

Knowledge Unlearning for Mitigating Privacy Risks in Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo
KELM, PILM, MU · 04 Oct 2022

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Mor Geva, Avi Caciularu, Ke Wang, Yoav Goldberg
KELM · 28 Mar 2022

Locating and Editing Factual Associations in GPT. Neural Information Processing Systems (NeurIPS), 2022.
Kevin Meng, David Bau, A. Andonian, Yonatan Belinkov
KELM · 10 Feb 2022

Transformer Feed-Forward Layers Are Key-Value Memories. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
Mor Geva, R. Schuster, Jonathan Berant, Omer Levy
KELM · 29 Dec 2020