2402.08787
Rethinking Machine Unlearning for Large Language Models
13 February 2024
Sijia Liu
Yuanshun Yao
Jinghan Jia
Stephen Casper
Nathalie Baracaldo
Peter Hase
Yuguang Yao
Chris Liu
Xiaojun Xu
Hang Li
Kush R. Varshney
Mohit Bansal
Sanmi Koyejo
Yang Liu
AILaw
MU
Papers citing
"Rethinking Machine Unlearning for Large Language Models"
12 / 12 papers shown
Title
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Zhiqi Bu
Xiaomeng Jin
Bhanukiran Vinzamuri
Anil Ramakrishna
Kai-Wei Chang
V. Cevher
Mingyi Hong
MU
41
6
0
29 Oct 2024
RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models
Bichen Wang
Yuzhe Zi
Yixin Sun
Yanyan Zhao
Bing Qin
MU
48
2
0
04 Jun 2024
What makes unlearning hard and what to do about it
Kairan Zhao
M. Kurmanji
George-Octavian Barbulescu
Eleni Triantafillou
Peter Triantafillou
MU
48
4
0
03 Jun 2024
Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning
Ruiqi Zhang
Licong Lin
Yu Bai
Song Mei
MU
43
63
0
08 Apr 2024
Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges
Nianwen Si
Hao Zhang
Heyu Chang
Wenlin Zhang
Dan Qu
Weiqiang Zhang
KELM
MU
47
26
0
27 Nov 2023
Adversarial Training Should Be Cast as a Non-Zero-Sum Game
Alexander Robey
Fabian Latorre
George J. Pappas
Hamed Hassani
V. Cevher
AAML
48
8
0
19 Jun 2023
Boundary Unlearning
Min Chen
Weizhuo Gao
Gaoyang Liu
Kai Peng
Chen Wang
MU
58
41
0
21 Mar 2023
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Joel Jang
Dongkeun Yoon
Sohee Yang
Sungmin Cha
Moontae Lee
Lajanugen Logeswaran
Minjoon Seo
KELM
PILM
MU
102
110
0
04 Oct 2022
A Survey of Machine Unlearning
Thanh Tam Nguyen
T. T. Huynh
Phi Le Nguyen
Alan Wee-Chung Liew
Hongzhi Yin
Quoc Viet Hung Nguyen
MU
57
150
0
06 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
273
8,441
0
04 Mar 2022
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
243
1,386
0
14 Dec 2020
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
187
390
0
25 Sep 2019