2410.07163
Cited By
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
9 October 2024
Chongyu Fan
Jiancheng Liu
Licong Lin
Jinghan Jia
Ruiqi Zhang
Song Mei
Sijia Liu
MU
Papers citing
"Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning"
9 / 9 papers shown
Title
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev
Christian Herold
Baohao Liao
Seyyed Hadi Hashemi
Shahram Khadivi
Christof Monz
MU
35
0
0
09 May 2025
A mean teacher algorithm for unlearning of language models
Yegor Klochkov
MU
58
0
0
18 Apr 2025
SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs
Aashiq Muhamed
Jacopo Bonato
Mona Diab
Virginia Smith
MU
37
0
0
11 Apr 2025
Bridging the Gap Between Preference Alignment and Machine Unlearning
Xiaohua Feng
Yuyuan Li
Huwei Ji
Jiaming Zhang
L. Zhang
Tianyu Du
Chaochao Chen
MU
35
0
0
09 Apr 2025
Understanding Machine Unlearning Through the Lens of Mode Connectivity
Jiali Cheng
Hadi Amiri
MU
30
0
0
08 Apr 2025
Not All Data Are Unlearned Equally
Aravind Krishnan
Siva Reddy
Marius Mosbach
MU
36
0
0
07 Apr 2025
Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-tuning
Yiwei Chen
Yuguang Yao
Yihua Zhang
Bingquan Shen
Gaowen Liu
Sijia Liu
AAML
MU
52
1
0
14 Mar 2025
Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models
Huazheng Wang
Yongcheng Jing
Haifeng Sun
Yingjie Wang
J. Wang
Jianxin Liao
Dacheng Tao
KELM
MU
42
0
0
27 Feb 2025
A General Framework to Enhance Fine-tuning-based LLM Unlearning
J. Ren
Zhenwei Dai
X. Tang
Hui Liu
Jingying Zeng
...
R. Goutam
Suhang Wang
Yue Xing
Qi He
Hui Liu
MU
91
1
0
25 Feb 2025