Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
arXiv: 2404.10719
16 April 2024
Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weiling Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu
Papers citing "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" (50 of 102 papers shown)
InfoPO: On Mutual Information Maximization for Large Language Model Alignment
Teng Xiao, Zhen Ge, Sujay Sanghavi, Tian Wang, Julian Katz-Samuels, Marc Versage, Qingjun Cui, Trishul M. Chilimbi
13 May 2025

A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji, Yanqiu Wu, Zhibin Wu, Shoujin Wang, Jian Yang, Mark Dras, Usman Naseem
05 May 2025

SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li, Daniel Khashabi
05 May 2025

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Jiancong Xiao, Bojian Hou, Zhanliang Wang, Ruochen Jin, Q. Long, Weijie Su, Li Shen
04 May 2025

LookAlike: Consistent Distractor Generation in Math MCQs
Nisarg Parikh, Nigel Fernandez, Alexander Scarlatos, Simon Woodhead, Andrew S. Lan
03 May 2025

Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs
Marina Sakharova, Abhinav Anand, Mira Mezini
21 Apr 2025

Direct Advantage Regression: Aligning LLMs with Online AI Reward
Li He, He Zhao, Stephen Wan, Dadong Wang, Lina Yao, Tongliang Liu
19 Apr 2025

Energy-Based Reward Models for Robust Language Model Alignment
Anamika Lochab, Ruqi Zhang
17 Apr 2025

Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
Shuai Zhao, Linchao Zhu, Yi Yang
14 Apr 2025

A Domain-Based Taxonomy of Jailbreak Vulnerabilities in Large Language Models
Carlos Peláez-González, Andrés Herrera-Poyatos, Cristina Zuheros, David Herrera-Poyatos, Virilo Tejedor, F. Herrera
07 Apr 2025 · AAML

On Data Synthesis and Post-training for Visual Abstract Reasoning
Ke Zhu, Y. Wang, Jiangjiang Liu, Qunyi Xie, Shanshan Liu, Gang Zhang
02 Apr 2025 · SyDa, LRM

Entropy-Based Adaptive Weighting for Self-Training
Xiaoxuan Wang, Yihe Deng, Mingyu Derek Ma, Wei Wang
31 Mar 2025 · LRM

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning
Yunhao Tang, Kunhao Zheng, Gabriel Synnaeve, Rémi Munos
25 Mar 2025

RL-finetuning LLMs from on- and off-policy data with a single algorithm
Yunhao Tang, Taco Cohen, David W. Zhang, Michal Valko, Rémi Munos
25 Mar 2025 · OffRL

MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization
Hengjia Li, Lifan Jiang, Xi Xiao, Tianyang Wang, Hongwei Yi, Boxi Wu, D. Cai
16 Mar 2025 · VGen

From Demonstrations to Rewards: Alignment Without Explicit Human Preferences
Siliang Zeng, Yao Liu, Huzefa Rangwala, George Karypis, Mingyi Hong, Rasool Fakoor
15 Mar 2025

Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model
Qiyuan Deng, X. Bai, Kehai Chen, Yaowei Wang, Liqiang Nie, Min Zhang
13 Mar 2025 · OffRL

EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments
Dongping Li, Tielong Cai, Tianci Tang, Wenhao Chai, Katherine Rose Driggs-Campbell, Gaoang Wang
11 Mar 2025 · LM&Ro

Soft Policy Optimization: Online Off-Policy RL for Sequence Models
Taco Cohen, David W. Zhang, Kunhao Zheng, Yunhao Tang, Rémi Munos, Gabriel Synnaeve
07 Mar 2025 · OffRL

Improving LLM Safety Alignment with Dual-Objective Optimization
Xuandong Zhao, Will Cai, Tianneng Shi, David Huang, Licong Lin, Song Mei, Dawn Song
05 Mar 2025 · AAML, MU

Preserving Cultural Identity with Context-Aware Translation Through Multi-Agent AI Systems
Mahfuz Ahmed Anik, Abdur Rahman, Azmine Toushik Wasi, Md Manjurul Ahsan
05 Mar 2025

All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
Gokul Swamy, Sanjiban Choudhury, Wen Sun, Zhiwei Steven Wu, J. Andrew Bagnell
03 Mar 2025 · OffRL

Preference Learning Unlocks LLMs' Psycho-Counseling Skills
Mian Zhang, S. Eack, Zhiyu Zoey Chen
27 Feb 2025

Controlled Diversity: Length-optimized Natural Language Generation
Diana Marie Schenke, Timo Baumann
26 Feb 2025

Self-Memory Alignment: Mitigating Factual Hallucinations with Generalized Improvement
Siyuan Zhang, Y. Zhang, Yinpeng Dong, Hang Su
26 Feb 2025 · HILM, KELM

Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data
Siqi Guo, Ilgee Hong, Vicente Balmaseda, Changlong Yu, Liang Qiu, Xin Liu, Haoming Jiang, Tuo Zhao, Tianbao Yang
25 Feb 2025

Training a Generally Curious Agent
Fahim Tajwar, Yiding Jiang, Abitha Thankaraj, Sumaita Sadia Rahman, J. Zico Kolter, Jeff Schneider, Ruslan Salakhutdinov
24 Feb 2025

Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance
Chenghua Huang, Lu Wang, Fangkai Yang, Pu Zhao, Z. Li, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang
24 Feb 2025 · OffRL

RLTHF: Targeted Human Feedback for LLM Alignment
Yifei Xu, Tusher Chakraborty, Emre Kıcıman, Bibek Aryal, Eduardo Rodrigues, ..., Rafael Padilha, Leonardo Nunes, Shobana Balakrishnan, Songwu Lu, Ranveer Chandra
24 Feb 2025

C-3DPO: Constrained Controlled Classification for Direct Preference Optimization
Kavosh Asadi, Julien Han, Xingzi Xu, Dominique Perrault-Joncas, Shoham Sabach, Karim Bouyarmane, Mohammad Ghavamzadeh
22 Feb 2025

Towards User-level Private Reinforcement Learning with Human Feedback
J. Zhang, Mingxi Lei, Meng Ding, Mengdi Li, Zihang Xiang, Difei Xu, Jinhui Xu, Di Wang
22 Feb 2025

Policy-to-Language: Train LLMs to Explain Decisions with Flow-Matching Generated Rewards
Xinyi Yang, Liang Zeng, Heng Dong, C. Yu, X. Wu, H. Yang, Yu Wang, Milind Tambe, Tonghan Wang
18 Feb 2025

CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models
Yifan Zhang, Xue Yang
17 Feb 2025

LLMs Can Teach Themselves to Better Predict the Future
Benjamin Turtel, Danny Franklin, Philipp Schoenegger
07 Feb 2025 · LRM

The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang, Qirun Dai, Hao Peng
06 Feb 2025 · ALM

Foundations of GenIR
Qingyao Ai, Jingtao Zhan, Y. Liu
06 Jan 2025

Geometric-Averaged Preference Optimization for Soft Preference Labels
Hiroki Furuta, Kuang-Huei Lee, Shixiang Shane Gu, Y. Matsuo, Aleksandra Faust, Heiga Zen, Izzeddin Gur
31 Dec 2024

REFA: Reference Free Alignment for multi-preference optimization
Taneesh Gupta, Rahul Madhavan, Xuchao Zhang, Chetan Bansal, Saravan Rajmohan
20 Dec 2024

Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang, Shengyu Zhang, J. Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, G. Wang, Eduard H. Hovy
05 Dec 2024 · OffRL

Interpreting Language Reward Models via Contrastive Explanations
Junqi Jiang, Tom Bewley, Saumitra Mishra, Freddy Lecue, Manuela Veloso
25 Nov 2024

Chain of Alignment: Integrating Public Will with Expert Intelligence for Language Model Alignment
Andrew Konya, Aviv Ovadya, K. J. Kevin Feng, Quan Ze Chen, Lisa Schirch, Colin Irwin, Amy X. Zhang
15 Nov 2024 · ALM

SEE-DPO: Self Entropy Enhanced Direct Preference Optimization
Shivanshu Shekhar, Shreyas Singh, Tong Zhang
06 Nov 2024

Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization
Xiyue Peng, Hengquan Guo, Jiawei Zhang, Dongqing Zou, Ziyu Shao, Honghao Wei, Xin Liu
25 Oct 2024

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron C. Courville
23 Oct 2024 · OffRL

Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
Songtao Jiang, Yan Zhang, Ruizhe Chen, Yeying Jin, Zuozhu Liu
20 Oct 2024 · MLLM, MoE

Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando, Han Shen, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen
20 Oct 2024 · CLL

On Designing Effective RL Reward at Training Time for LLM Reasoning
Jiaxuan Gao, Shusheng Xu, Wenjie Ye, Weilin Liu, Chuyi He, Wei Fu, Zhiyu Mei, Guangju Wang, Yi Wu
19 Oct 2024 · OffRL, LRM

POROver: Improving Safety and Reducing Overrefusal in Large Language Models with Overgeneration and Preference Optimization
Batuhan K. Karaman, Ishmam Zabir, Alon Benhaim, Vishrav Chaudhary, M. Sabuncu, Xia Song
16 Oct 2024 · AI4CE

Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao, Wenxuan Ding, Shangbin Feng, Lucy Lu Wang, Yulia Tsvetkov
14 Oct 2024

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
11 Oct 2024