A General Theoretical Paradigm to Understand Learning from Human Preferences
M. G. Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
arXiv:2310.12036, 18 October 2023
Papers citing "A General Theoretical Paradigm to Understand Learning from Human Preferences" (50 of 574 shown)
Preference Optimization with Multi-Sample Comparisons. Chaoqi Wang, Zhuokai Zhao, Chen Zhu, Karthik Abinav Sankararaman, Michal Valko, ..., Zhaorun Chen, Madian Khabsa, Yuxin Chen, Hao Ma, Sinong Wang. 16 Oct 2024.
Understanding Likelihood Over-optimisation in Direct Alignment Algorithms. Zhengyan Shi, Sander Land, Acyr Locatelli, Matthieu Geist, Max Bartolo. 15 Oct 2024.
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only. Jihan Yao, Wenxuan Ding, Shangbin Feng, Lucy Lu Wang, Yulia Tsvetkov. International Conference on Learning Representations (ICLR), 2024. 14 Oct 2024.
How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective. Teng Xiao, Mingxiao Li, Yige Yuan, Huaisheng Zhu, Chao Cui, V. Honavar. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 14 Oct 2024. [ALM]
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps. Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Jian Shu. International Conference on Learning Representations (ICLR), 2024. 14 Oct 2024. [LRM]
Taming Overconfidence in LLMs: Reward Calibration in RLHF. Jixuan Leng, Chengsong Huang, Banghua Zhu, Jiaxin Huang. International Conference on Learning Representations (ICLR), 2024. 13 Oct 2024.
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions. Xiaoran Jiao, Weian Mao, Wengong Jin, Peiyuan Yang, Hao Chen, Chunhua Shen. 12 Oct 2024.
VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment. Lei Li, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, L. Chen, Yazheng Yang, Benyou Wang, Dianbo Sui, Qiang Liu. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 12 Oct 2024. [VLM, ALM]
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment. Huayu Chen, Hang Su, Peize Sun, Jun Zhu. International Conference on Learning Representations (ICLR), 2024. 12 Oct 2024. [VLM]
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both. Abhijnan Nath, Changsoo Jung, Ethan Seefried, Nikhil Krishnaswamy. 11 Oct 2024.
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning. Tingchen Fu, Mrinank Sharma, Juil Sock, Shay B. Cohen, David M. Krueger, Fazl Barez. 11 Oct 2024. [AAML]
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization. Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin. International Conference on Learning Representations (ICLR), 2024. 11 Oct 2024.
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization. Guanlin Liu, Kaixuan Ji, Ning Dai, Zheng Wu, Chen Dun, Quanquan Gu, Lin Yan. 11 Oct 2024. [OffRL, LRM]
Evolutionary Contrastive Distillation for Language Model Alignment. Julian Katz-Samuels, Zheng Li, Hyokun Yun, Priyanka Nigam, Yi Xu, Vaclav Petricek, Bing Yin, Trishul Chilimbi. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 10 Oct 2024. [ALM, SyDa]
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs. Shenao Zhang, Zhihan Liu, Boyi Liu, Yanzhe Zhang, Yingxiang Yang, Yunxing Liu, Liyu Chen, Tao Sun, Ziyi Wang. 10 Oct 2024.
COS-DPO: Conditioned One-Shot Multi-Objective Fine-Tuning Framework. Yinuo Ren, Tesi Xiao, Michael Shavlovsky, Lexing Ying, Holakou Rahmanian. Conference on Uncertainty in Artificial Intelligence (UAI), 2024. 10 Oct 2024.
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning. Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu. 09 Oct 2024. [MU]
Accelerated Preference Optimization for Large Language Model Alignment. Jiafan He, Huizhuo Yuan, Q. Gu. 08 Oct 2024.
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss. Xin Mao, Feng-Lin Li, Huimin Xu, Wei Zhang, Wang Chen, Anh Tuan Luu. International Conference on Learning Representations (ICLR), 2024. 07 Oct 2024.
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths. Yew Ken Chia, Guizhen Chen, Weiwen Xu, Luu Anh Tuan, Soujanya Poria, Lidong Bing. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 07 Oct 2024. [LRM]
DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback. Efstathia Soufleri, Ujwal Dinesha, Debajoy Mukherjee, Jian Li, Srinivas Shakkottai. International Conference on Learning Representations (ICLR), 2024. 07 Oct 2024.
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks. Fenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras, H. Ammar, Jun Wang. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 07 Oct 2024.
LRHP: Learning Representations for Human Preferences via Preference Pairs. Chenglong Wang, Yang Gan, Yifu Huo, Yongyu Mu, Qiaozhi He, Murun Yang, Tong Xiao, Chunliang Zhang, Tongran Liu, Jingbo Zhu. 06 Oct 2024. [AI4TS]
Latent Feature Mining for Predictive Model Enhancement with Large Language Models. Bingxuan Li, Pengyi Shi, Amy Ward. 06 Oct 2024.
MVP-Bench: Can Large Vision-Language Models Conduct Multi-level Visual Perception Like Humans? Guanzhen Li, Yuxi Xie, Min-Yen Kan. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 06 Oct 2024. [VLM]
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF. Zhaolin Gao, Wenhao Zhan, Jonathan D. Chang, Gokul Swamy, Kianté Brantley, Jason D. Lee, Wen Sun. International Conference on Learning Representations (ICLR), 2024. 06 Oct 2024. [OffRL]
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification. Zhenwen Liang, Ye Liu, Tong Niu, Xiangliang Zhang, Yingbo Zhou, Semih Yavuz. 05 Oct 2024. [LRM]
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization. Hanyang Zhao, Genta Indra Winata, Anirban Das, Shi-Xiong Zhang, D. Yao, Wenpin Tang, Sambit Sahu. International Conference on Learning Representations (ICLR), 2024. 05 Oct 2024.
Learning Code Preference via Synthetic Evolution. Jiawei Liu, Thanh Nguyen, Mingyue Shang, Hantian Ding, Xiaopeng Li, Yu Yu, Varun Kumar, Zijian Wang. 04 Oct 2024. [SyDa, ALM, AAML]
Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback. Kyuyoung Kim, Ah Jeong Seo, Hao Liu, Jinwoo Shin, Kimin Lee. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 04 Oct 2024.
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale. Haoran Xu, Kenton W. Murray, Philipp Koehn, Hieu T. Hoang, Akiko Eriguchi, Huda Khayrallah. International Conference on Learning Representations (ICLR), 2024. 04 Oct 2024.
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation. Rohin Manvi, Anikait Singh, Stefano Ermon. 03 Oct 2024. [SyDa]
Strong Preferences Affect the Robustness of Preference Models and Value Alignment. Ziwei Xu, Mohan Kankanhalli. International Conference on Learning Representations (ICLR), 2024. 03 Oct 2024. [AAML]
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions. Yekun Chai, Haoran Sun, Huang Fang, Shuohuan Wang, Yu Sun, Hua Wu. International Conference on Learning Representations (ICLR), 2024. 03 Oct 2024.
Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment. Yifan Zhang, Ge Zhang, Yue Wu, Kangping Xu, Quanquan Gu. 03 Oct 2024.
Generative Reward Models. Dakota Mahan, Duy Phung, Rafael Rafailov, Chase Blagden, Nathan Lile, Louis Castricato, Jan-Philipp Fränken, Chelsea Finn, Alon Albalak. 02 Oct 2024. [VLM, SyDa, OffRL]
Beyond Scalar Reward Model: Learning Generative Judge from Preference Data. Ziyi Ye, Xiangsheng Li, Qiuchi Li, Jiaxin Mao, Yujia Zhou, Wei Shen, Dong Yan, Yiqun Liu. 01 Oct 2024.
The Crucial Role of Samplers in Online Direct Preference Optimization. Ruizhe Shi, Runlong Zhou, Simon S. Du. International Conference on Learning Representations (ICLR), 2024. 29 Sep 2024.
Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review. Emma Croxford, Yanjun Gao, Nicholas Pellegrino, Karen K. Wong, Graham Wills, Elliot First, Frank J. Liao, Cherodeep Goswami, Brian Patterson, Majid Afshar. 26 Sep 2024. [HILM, ELM, LM&MA]
Inference-Time Language Model Alignment via Integrated Value Guidance. Zhixuan Liu, Zhanhui Zhou, Yuanfu Wang, Chao Yang, Yu Qiao. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 26 Sep 2024.
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness. Jian Li, Haojing Huang, Yujia Zhang, Pengfei Xu, Xi Chen, Rui Song, Lida Shi, Jingwen Wang, Hao Xu. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. 26 Sep 2024.
Modulated Intervention Preference Optimization (MIPO): Keep the Easy, Refine the Difficult. Cheolhun Jang. 26 Sep 2024.
Just Say What You Want: Only-prompting Self-rewarding Online Preference Optimization. Ruijie Xu, Zhihan Liu, Yongfei Liu, Shipeng Yan, Zhaoran Wang, Zhi-Li Zhang, Xuming He. 26 Sep 2024. [ALM]
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference. Qining Zhang, Lei Ying. International Conference on Learning Representations (ICLR), 2024. 25 Sep 2024. [OffRL]
On Extending Direct Preference Optimization to Accommodate Ties. Jinghong Chen, Guangyu Yang, Weizhe Lin, Jingbiao Mei, Bill Byrne. 25 Sep 2024.
Orthogonal Finetuning for Direct Preference Optimization. Chenxu Yang, Ruipeng Jia, Naibin Gu, Zheng Lin, Siyuan Chen, Chao Pang, Weichong Yin, Yu Sun, Hua Wu, Weiping Wang. 23 Sep 2024.
Backtracking Improves Generation Safety. Yiming Zhang, Jianfeng Chi, Hailey Nguyen, Kartikeya Upasani, Daniel M. Bikel, Jason Weston, Eric Michael Smith. 22 Sep 2024. [SILM]
RRM: Robust Reward Model Training Mitigates Reward Hacking. Tianqi Liu, Wei Xiong, Jie Jessie Ren, Lichang Chen, Junru Wu, ..., Yuan Liu, Bilal Piot, Abe Ittycheriah, Aviral Kumar, Mohammad Saleh. International Conference on Learning Representations (ICLR), 2024. 20 Sep 2024. [AAML]
Preference Alignment Improves Language Model-Based TTS. Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang, Jianwei Yu, Shinji Watanabe, Dong Yu. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024. 19 Sep 2024.
Reward-Robust RLHF in LLMs. Yuzi Yan, Xingzhou Lou, Jialian Li, Yiping Zhang, Jian Xie, Chao Yu, Yu Wang, Dong Yan, Yuan Shen. 18 Sep 2024.