arXiv: 2205.13636
Quark: Controllable Text Generation with Reinforced Unlearning
26 May 2022
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
Papers citing "Quark: Controllable Text Generation with Reinforced Unlearning"
50 / 175 papers shown
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev
Christian Herold
Baohao Liao
Seyyed Hadi Hashemi
Shahram Khadivi
Christof Monz
MU
42
0
0
09 May 2025
Teaching Models to Understand (but not Generate) High-risk Data
Ryan Yixiang Wang
Matthew Finlayson
Luca Soldaini
Swabha Swayamdipta
Robin Jia
24
0
0
05 May 2025
TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation
Gwen Yidou Weng
Benjie Wang
Guy Van den Broeck
BDL
33
0
0
25 Apr 2025
Bridging the Gap Between Preference Alignment and Machine Unlearning
Xiaohua Feng
Yuyuan Li
Huwei Ji
Jiaming Zhang
L. Zhang
Tianyu Du
Chaochao Chen
MU
38
0
0
09 Apr 2025
Effective Skill Unlearning through Intervention and Abstention
Yongce Li
Chung-En Sun
Tsui-Wei Weng
MU
51
0
0
27 Mar 2025
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
Haoming Xu
Shuxun Wang
Yanqiu Zhao
Yi Zhong
Ziyan Jiang
Ningyuan Zhao
Shumin Deng
H. Chen
N. Zhang
MoMe
MU
67
0
0
27 Mar 2025
CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning
Andrew Rufail
Daniel Kim
Sean O'Brien
Kevin Zhu
LRM
25
0
0
24 Mar 2025
Palette of Language Models: A Solver for Controlled Text Generation
Zhe Yang
Yi Huang
Yaqin Chen
Xiaoting Wu
Junlan Feng
Chao Deng
38
0
0
14 Mar 2025
Rewarding Doubt: A Reinforcement Learning Approach to Confidence Calibration of Large Language Models
Paul Stangel
D. Bani-Harouni
Chantal Pellegrini
Ege Ozsoy
Kamilia Zaripova
Matthias Keicher
Nassir Navab
29
1
0
04 Mar 2025
Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models
Huazheng Wang
Yongcheng Jing
Haifeng Sun
Yingjie Wang
J. Wang
Jianxin Liao
Dacheng Tao
KELM
MU
42
0
0
27 Feb 2025
A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models
Jiahui Geng
Qing Li
Herbert Woisetschlaeger
Zongxiong Chen
Y. Wang
Preslav Nakov
Hans-Arno Jacobsen
Fakhri Karray
MU
41
1
0
22 Feb 2025
Linear Mode Connectivity in Differentiable Tree Ensembles
Ryuichi Kanoh
M. Sugiyama
60
1
0
17 Feb 2025
ReLearn: Unlearning via Learning for Large Language Models
Haoming Xu
Ningyuan Zhao
Liming Yang
Sendong Zhao
Shumin Deng
Mengru Wang
Bryan Hooi
Nay Oo
H. Chen
N. Zhang
KELM
CLL
MU
56
0
0
16 Feb 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
48
0
0
07 Jan 2025
Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models
Kaleem Ullah Qasim
Jiashu Zhang
Tariq Alsahfi
Ateeq Ur Rehman Butt
LRM
ReLM
61
1
0
03 Jan 2025
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
A. Feder Cooper
Christopher A. Choquette-Choo
Miranda Bogen
Matthew Jagielski
Katja Filippova
...
Abigail Z. Jacobs
Andreas Terzis
Hanna M. Wallach
Nicolas Papernot
Katherine Lee
AILaw
MU
86
10
0
09 Dec 2024
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang
Shengyu Zhang
J. Zhang
Runyi Hu
Xiaoya Li
Tianwei Zhang
Jiwei Li
Fei Wu
G. Wang
Eduard H. Hovy
OffRL
111
6
0
05 Dec 2024
Unified Parameter-Efficient Unlearning for LLMs
Chenlu Ding
Jiancan Wu
Yancheng Yuan
Jinda Lu
Kai Zhang
Alex Su
Xiang Wang
Xiangnan He
MU
KELM
88
6
0
30 Nov 2024
Towards Robust Evaluation of Unlearning in LLMs via Data Transformations
Abhinav Joshi
Shaswati Saha
Divyaksh Shukla
Sriram Vema
Harsh Jhamtani
Manas Gaur
Ashutosh Modi
MU
72
0
0
23 Nov 2024
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia
Jiancheng Liu
Yihua Zhang
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
33
2
0
23 Oct 2024
CLEAR: Character Unlearning in Textual and Visual Modalities
Alexey Dontsov
Dmitrii Korzh
Alexey Zhavoronkin
Boris Mikheev
Denis Bobkov
Aibek Alanov
Oleg Y. Rogov
Ivan V. Oseledets
Elena Tutubalina
AILaw
VLM
MU
50
5
0
23 Oct 2024
Guaranteed Generation from Large Language Models
Minbeom Kim
Thibaut Thonet
Jos Rozen
Hwaran Lee
Kyomin Jung
Marc Dymetman
21
1
0
09 Oct 2024
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
Chongyu Fan
Jiancheng Liu
Licong Lin
Jinghan Jia
Ruiqi Zhang
Song Mei
Sijia Liu
MU
39
15
0
09 Oct 2024
Large Language Models can be Strong Self-Detoxifiers
Ching-Yun Ko
Pin-Yu Chen
Payel Das
Youssef Mroueh
Soham Dan
Georgios Kollias
Subhajit Chaudhury
Tejaswini Pedapati
Luca Daniel
18
2
0
04 Oct 2024
Erasing Conceptual Knowledge from Language Models
Rohit Gandikota
Sheridan Feucht
Samuel Marks
David Bau
KELM
ELM
MU
40
5
0
03 Oct 2024
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models
Shayekh Bin Islam
Md Asib Rahman
K S M Tozammel Hossain
Enamul Hoque
Shafiq R. Joty
Md. Rizwan Parvez
RALM
AIFin
LRM
VLM
29
12
0
02 Oct 2024
Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts
Tingchen Fu
Yupeng Hou
Julian McAuley
Rui Yan
22
3
0
09 Aug 2024
Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret
S. Dhuliawala
K. Murugesan
Mrinmaya Sachan
VLM
33
1
0
24 Jul 2024
Establishing Knowledge Preference in Language Models
Sizhe Zhou
Sha Li
Yu Meng
Yizhu Jiao
Heng Ji
Jiawei Han
KELM
28
0
0
17 Jul 2024
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs
Zhenhua Liu
Tong Zhu
Chuanyuan Tan
Wenliang Chen
PILM
MU
32
8
0
14 Jul 2024
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi
Jaechan Lee
Yangsibo Huang
Sadhika Malladi
Jieyu Zhao
Ari Holtzman
Daogao Liu
Luke Zettlemoyer
Noah A. Smith
Chiyuan Zhang
MU
ELM
40
36
0
08 Jul 2024
Orchestrating LLMs with Different Personalizations
Jin Peng Zhou
Katie Z Luo
Jingwen Gu
Jason Yuan
Kilian Q. Weinberger
Wen Sun
38
2
0
04 Jul 2024
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Deng Cai
Huayang Li
Tingchen Fu
Siheng Li
Weiwen Xu
...
Leyang Cui
Yan Wang
Lemao Liu
Taro Watanabe
Shuming Shi
KELM
26
2
0
24 Jun 2024
CAVE: Controllable Authorship Verification Explanations
Sahana Ramnath
Kartik Pandey
Elizabeth Boschee
Xiang Ren
45
1
0
24 Jun 2024
Multi-Objective Linguistic Control of Large Language Models
Dang Nguyen
Jiuhai Chen
Tianyi Zhou
32
0
0
23 Jun 2024
MU-Bench: A Multitask Multimodal Benchmark for Machine Unlearning
Jiali Cheng
Hadi Amiri
BDL
33
3
0
21 Jun 2024
MACAROON: Training Vision-Language Models To Be Your Engaged Partners
Shujin Wu
Yi Ren Fung
Sha Li
Yixin Wan
Kai-Wei Chang
Heng Ji
28
5
0
20 Jun 2024
Towards Minimal Targeted Updates of Language Models with Targeted Negative Training
Lily H. Zhang
Rajesh Ranganath
Arya Tafvizi
22
1
0
19 Jun 2024
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
Jie Chen
Yupeng Zhang
Bingning Wang
Wayne Xin Zhao
Ji-Rong Wen
Weipeng Chen
SyDa
27
4
0
18 Jun 2024
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs
S. Kadhe
Farhan Ahmed
Dennis Wei
Nathalie Baracaldo
Inkit Padhi
MoMe
MU
21
5
0
17 Jun 2024
Style Transfer with Multi-iteration Preference Optimization
Shuai Liu
Jonathan May
24
3
0
17 Jun 2024
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
Zhuoran Jin
Pengfei Cao
Chenhao Wang
Zhitao He
Hongbang Yuan
Jiachun Li
Yubo Chen
Kang Liu
Jun Zhao
KELM
MU
31
12
0
16 Jun 2024
A More Practical Approach to Machine Unlearning
David Zagardo
MU
27
0
0
13 Jun 2024
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Hamish Ivison
Yizhong Wang
Jiacheng Liu
Zeqiu Wu
Valentina Pyatkin
Nathan Lambert
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
26
38
0
13 Jun 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu
Lingfeng Shen
Xinyu Yang
Weiting Tan
Beidi Chen
Huaxiu Yao
28
2
0
12 Jun 2024
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas
Chengyuan Deng
Yiqun Duan
Xin Jin
Heng Chang
Yijun Tian
...
Kuofeng Gao
Sihong He
Jun Zhuang
Lu Cheng
Haohan Wang
AILaw
38
14
0
08 Jun 2024
RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models
Bichen Wang
Yuzhe Zi
Yixin Sun
Yanyan Zhao
Bing Qin
MU
58
8
0
04 Jun 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
34
19
0
04 Jun 2024
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee
Sue Hyun Park
Seungone Kim
Minjoon Seo
ALM
14
35
0
28 May 2024
Large Scale Knowledge Washing
Yu-Xiang Wang
Ruihan Wu
Zexue He
X. Chen
Julian McAuley
MU
KELM
35
4
0
26 May 2024