arXiv:2205.13636
Quark: Controllable Text Generation with Reinforced Unlearning
26 May 2022
Ximing Lu, Sean Welleck, Jack Hessel, Liwei Jiang, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, Yejin Choi
[MU]
Papers citing "Quark: Controllable Text Generation with Reinforced Unlearning" (50 of 175 papers shown)
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models — George-Octavian Barbulescu, Peter Triantafillou (06 May 2024) [MU]

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data — Fahim Tajwar, Anika Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar (22 Apr 2024)

U Can't Gen This? A Survey of Intellectual Property Protection Methods for Data in Generative AI — Tanja Sarcevic, Alicja Karlowicz, Rudolf Mayer, Ricardo A. Baeza-Yates, Andreas Rauber (22 Apr 2024)

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation — Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yi-Ting Guo, Jie Fu (31 Mar 2024) [RALM]

CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment — Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu (25 Mar 2024)

The Frontier of Data Erasure: Machine Unlearning for Large Language Models — Youyang Qu, Ming Ding, Nan Sun, Kanchana Thilakarathna, Tianqing Zhu, Dusit Niyato (23 Mar 2024) [MU]

Multi-Review Fusion-in-Context — Aviv Slobodkin, Ori Shapira, Ran Levy, Ido Dagan (22 Mar 2024)

Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation — Do June Min, Verónica Pérez-Rosas, Kenneth Resnicow, Rada Mihalcea (20 Mar 2024) [OffRL]

Threats, Attacks, and Defenses in Machine Unlearning: A Survey — Ziyao Liu, Huanyi Ye, Chen Chen, Yongsen Zheng, K. Lam (20 Mar 2024) [AAML, MU]

Reinforcement Learning with Token-level Feedback for Controllable Text Generation — Wendi Li, Wei Wei, Kaihe Xu, Wenfeng Xie, Dangyang Chen, Yu Cheng (18 Mar 2024)
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback — Ang Li, Qiugen Xiao, Peng Cao, Jian Tang, Yi Yuan, ..., Weidong Guo, Yukang Gan, Jeffrey Xu Yu, D. Wang, Ying Shan (13 Mar 2024) [VLM, ALM]

Authorship Style Transfer with Policy Optimization — Shuai Liu, Shantanu Agarwal, Jonathan May (12 Mar 2024)

Teaching Large Language Models to Reason with Reinforcement Learning — Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu (07 Mar 2024) [ReLM, LRM]

Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization — Shitong Duan, Xiaoyuan Yi, Peng Zhang, T. Lu, Xing Xie, Ning Gu (06 Mar 2024)

Eight Methods to Evaluate Robust Unlearning in LLMs — Aengus Lynch, Phillip Guo, Aidan Ewart, Stephen Casper, Dylan Hadfield-Menell (26 Feb 2024) [ELM, MU]

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs — Runlong Zhou, Simon S. Du, Beibin Li (20 Feb 2024) [OffRL]

Direct Preference Optimization with an Offset — Afra Amini, Tim Vieira, Ryan Cotterell (16 Feb 2024)

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment — Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen (15 Feb 2024)

Rethinking Machine Unlearning for Large Language Models — Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, ..., Hang Li, Kush R. Varshney, Mohit Bansal, Sanmi Koyejo, Yang Liu (13 Feb 2024) [AILaw, MU]
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability — Xing-ming Guo, Fangxu Yu, Huan Zhang, Lianhui Qin, Bin Hu (13 Feb 2024) [AAML]

A Dense Reward View on Aligning Text-to-Image Diffusion with Preference — Shentao Yang, Tianqi Chen, Mingyuan Zhou (13 Feb 2024) [EGVM]

Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models — Lingzhi Wang, Xingshan Zeng, Jinsong Guo, Kam-Fai Wong, Georg Gottlob (08 Feb 2024) [MU, AAML, KELM]

BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback — Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernández Astudillo (04 Feb 2024) [BDL]

Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes — Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Tong Yu, Hanieh Deilamsalehy, Ruiyi Zhang, Sungchul Kim, Franck Dernoncourt (03 Feb 2024)

KTO: Model Alignment as Prospect Theoretic Optimization — Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela (02 Feb 2024)

Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents — Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang (01 Feb 2024) [LLMAG]

The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts — Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi (23 Jan 2024)

Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation — Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng (14 Jan 2024) [OffRL, ALM]
TOFU: A Task of Fictitious Unlearning for LLMs — Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary Chase Lipton, J. Zico Kolter (11 Jan 2024) [MU, CLL]

Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint — Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen (11 Jan 2024) [KELM]

Benchmarking Large Language Models on Controllable Generation under Diversified Instructions — Yihan Chen, Benfeng Xu, Quan Wang, Yi Liu, Zhendong Mao (01 Jan 2024) [ALM, ELM]

Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding — Lifu Tu, Semih Yavuz, Jin Qu, Jiacheng Xu, Rui Meng, Caiming Xiong, Yingbo Zhou (11 Dec 2023)

NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation — Peter West, Ronan Le Bras, Taylor Sorensen, Bill Yuchen Lin, Liwei Jiang, ..., Khyathi Raghavi Chandu, Jack Hessel, Ashutosh Baheti, Chandra Bhagavatula, Yejin Choi (10 Dec 2023) [VLM]

ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference — Tianchi Cai, Xierui Song, Jiyan Jiang, Fei Teng, Jinjie Gu, Guannan Zhang (05 Dec 2023) [ALM]

Tackling Bias in Pre-trained Language Models: Current Trends and Under-represented Societies — Vithya Yogarajan, Gillian Dobbie, Te Taka Keegan, R. Neuwirth (03 Dec 2023) [ALM]

Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language — Di Jin, Shikib Mehri, Devamanyu Hazarika, Aishwarya Padmakumar, Sungjin Lee, Yang Liu, Mahdi Namazifar (24 Nov 2023) [ALM]

Case Repositories: Towards Case-Based Reasoning for AI Alignment — K. J. Kevin Feng, Quan Ze Chen, Inyoung Cheong, King Xia, Amy X. Zhang (18 Nov 2023)

MultiDelete for Multimodal Machine Unlearning — Jiali Cheng, Hadi Amiri (18 Nov 2023) [MU]
DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback — Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran (16 Nov 2023)

Workflow-Guided Response Generation for Task-Oriented Dialogue — Do June Min, Paloma Sodhi, Ramya Ramakrishnan (14 Nov 2023)

Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor — Sangwon Yu, Changmin Lee, Hojin Lee, Sungroh Yoon (13 Nov 2023)

STEER: Unified Style Transfer with Expert Reinforcement — Skyler Hallinan, Faeze Brahman, Ximing Lu, Jaehun Jung, Sean Welleck, Yejin Choi (13 Nov 2023) [OffRL]

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering — Sheng Liu, Haotian Ye, Lei Xing, James Y. Zou (11 Nov 2023)

Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment — Geyang Guo, Ranchi Zhao, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen (07 Nov 2023) [ALM]

Successor Features for Efficient Multisubject Controlled Text Generation — Mengyao Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian (03 Nov 2023) [BDL]

Vanishing Gradients in Reinforcement Finetuning of Language Models — Noam Razin, Hattie Zhou, Omid Saremi, Vimal Thilak, Arwen Bradley, Preetum Nakkiran, Josh Susskind, Etai Littwin (31 Oct 2023)

Learning From Mistakes Makes LLM Better Reasoner — Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen (31 Oct 2023) [LRM]

Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery — Katie Z Luo, Zhenzhen Liu, Xiangyu Chen, Yurong You, Sagie Benaim, Cheng Perng Phoo, Mark E. Campbell, Wen Sun, B. Hariharan, Kilian Q. Weinberger (29 Oct 2023) [OffRL]

Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting — Hejie Cui, Xinyu Fang, Zihan Zhang, Ran Xu, Xuan Kan, Xin Liu, Yue Yu, Manling Li, Yangqiu Song, Carl Yang (28 Oct 2023) [VLM]

N-Critics: Self-Refinement of Large Language Models with Ensemble of Critics — Sajad Mousavi, Ricardo Luna Gutierrez, Desik Rengarajan, Vineet Gundecha, Ashwin Ramesh Babu, Avisek Naug, Antonio Guillen-Perez, S. Sarkar (28 Oct 2023) [LRM, HILM, KELM]