arXiv:2310.12036
A General Theoretical Paradigm to Understand Learning from Human Preferences
M. G. Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos. 18 October 2023.
Papers citing "A General Theoretical Paradigm to Understand Learning from Human Preferences" (50 / 415 papers shown)
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin, Shentao Yang, Yujia Xie, Ziyi Yang, Yuting Sun, Hany Awadalla, Weizhu Chen, Mingyuan Zhou. 07 Jan 2025.

Verbosity-Aware Rationale Reduction: Effective Reduction of Redundant Rationale via Principled Criteria
Joonwon Jang, Jaehee Kim, Wonbin Kweon, Hwanjo Yu. 03 Jan 2025. [LRM]

An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems
Hashmath Shaik, Alex Doboli. 31 Dec 2024. [OffRL, ELM]
Geometric-Averaged Preference Optimization for Soft Preference Labels
Hiroki Furuta, Kuang-Huei Lee, Shixiang Shane Gu, Y. Matsuo, Aleksandra Faust, Heiga Zen, Izzeddin Gur. 31 Dec 2024.

Understanding the Logic of Direct Preference Alignment through Logic
Kyle Richardson, Vivek Srikumar, Ashish Sabharwal. 23 Dec 2024.

JailPO: A Novel Black-box Jailbreak Framework via Preference Optimization against Aligned LLMs
H. Li, Jiawei Ye, Jie Wu, Tianjie Yan, Chu Wang, Zhixin Li. 20 Dec 2024. [AAML]

REFA: Reference Free Alignment for multi-preference optimization
Taneesh Gupta, Rahul Madhavan, Xuchao Zhang, Chetan Bansal, Saravan Rajmohan. 20 Dec 2024.
Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model
Yuzhong Hong, Hanshan Zhang, Junwei Bao, Hongfei Jiang, Yang Song. 18 Dec 2024. [OffRL]

The Superalignment of Superhuman Intelligence with Large Language Models
Minlie Huang, Yingkang Wang, Shiyao Cui, Pei Ke, J. Tang. 15 Dec 2024.

Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration
Avinandan Bose, Zhihan Xiong, Aadirupa Saha, S. Du, Maryam Fazel. 13 Dec 2024.
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang, Shengyu Zhang, J. Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, G. Wang, Eduard H. Hovy. 05 Dec 2024. [OffRL]

Time-Reversal Provides Unsupervised Feedback to LLMs
Yerram Varun, Rahul Madhavan, Sravanti Addepalli, A. Suggala, Karthikeyan Shanmugam, Prateek Jain. 03 Dec 2024. [LRM, SyDa]

VideoSAVi: Self-Aligned Video Language Models without Human Supervision
Yogesh Kulkarni, Pooyan Fazli. 01 Dec 2024. [VLM]

Learning from Relevant Subgoals in Successful Dialogs using Iterative Training for Task-oriented Dialog Systems
Magdalena Kaiser, P. Ernst, György Szarvas. 25 Nov 2024.
DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs
Zhihan Liu, Shenao Zhang, Yongfei Liu, Boyi Liu, Yingxiang Yang, Zhaoran Wang. 20 Nov 2024.

Reward Modeling with Ordinal Feedback: Wisdom of the Crowd
Shang Liu, Yu Pan, Guanting Chen, Xiaocheng Li. 19 Nov 2024.

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
Xinyan Guan, Yanjiang Liu, Xinyu Lu, Boxi Cao, Ben He, ..., Le Sun, Jie Lou, Bowen Yu, Y. Lu, Hongyu Lin. 18 Nov 2024. [ALM]

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang, Zhe Chen, Wenhai Wang, Yue Cao, Yangzhou Liu, ..., Jinguo Zhu, X. Zhu, Lewei Lu, Yu Qiao, Jifeng Dai. 15 Nov 2024. [LRM]
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
A. Jain, Harley Wiltzer, Jesse Farebrother, Irina Rish, Glen Berseth, Sanjiban Choudhury. 11 Nov 2024.

Towards Improved Preference Optimization Pipeline: from Data Generation to Budget-Controlled Regularization
Zhuotong Chen, Fang Liu, Jennifer Zhu, Wanyu Du, Yanjun Qi. 07 Nov 2024.

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang. 07 Nov 2024. [OffRL]

Sample-Efficient Alignment for LLMs
Zichen Liu, Changyu Chen, Chao Du, Wee Sun Lee, Min-Bin Lin. 03 Nov 2024.
TODO: Enhancing LLM Alignment with Ternary Preferences
Yuxiang Guo, Lu Yin, Bo Jiang, Jiaqi Zhang. 02 Nov 2024.

COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Y. Liu, Argyris Oikonomou, Weiqiang Zheng, Yang Cai, Arman Cohan. 30 Oct 2024.

Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
Sheryl Hsu, Omar Khattab, Chelsea Finn, Archit Sharma. 30 Oct 2024. [KELM, RALM]

VPO: Leveraging the Number of Votes in Preference Optimization
Jae Hyeon Cho, Minkyung Park, Byung-Jun Lee. 30 Oct 2024.
f-PO: Generalizing Preference Optimization with f-divergence Minimization
Jiaqi Han, Mingjian Jiang, Yuxuan Song, J. Leskovec, Stefano Ermon. 29 Oct 2024.

Matryoshka: Learning to Drive Black-Box LLMs with LLMs
Changhao Li, Yuchen Zhuang, Rushi Qiang, Haotian Sun, H. Dai, Chao Zhang, Bo Dai. 28 Oct 2024. [LRM]

UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function
Zhichao Wang, Bin Bi, Z. Zhu, Xiangbo Mao, Jun Wang, Shiyu Wang. 28 Oct 2024. [CLL]

Learning from Response not Preference: A Stackelberg Approach for LLM Detoxification using Non-parallel Data
Xinhong Xie, Tao Li, Quanyan Zhu. 27 Oct 2024.
Fast Best-of-N Decoding via Speculative Rejection
Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter L. Bartlett, Andrea Zanette. 26 Oct 2024. [BDL]

Uncertainty-Penalized Direct Preference Optimization
Sam Houliston, Alizée Pace, Alexander Immer, Gunnar Rätsch. 26 Oct 2024.

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
Shilong Li, Yancheng He, Hui Huang, Xingyuan Bu, J. Liu, Hangyu Guo, Weixun Wang, Jihao Gu, Wenbo Su, Bo Zheng. 25 Oct 2024.

Inference time LLM alignment in single and multidomain preference spectrum
S., Zheng Qi, Nikolaos Pappas, Srikanth Doss Kadarundalagi Raghuram Doss, Monica Sunkara, Kishaloy Halder, Manuel Mager, Yassine Benajiba. 24 Oct 2024.
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Wenhong Zhu, Zhiwei He, Xiaofeng Wang, Pengfei Liu, Rui Wang. 24 Oct 2024. [OSLM]

Scalable Ranked Preference Optimization for Text-to-Image Generation
Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, Sergey Tulyakov, J. Ren, Anil Kag. 23 Oct 2024. [EGVM]

Optimal Design for Reward Modeling in RLHF
Antoine Scheid, Etienne Boursier, Alain Durmus, Michael I. Jordan, Pierre Ménard, Eric Moulines, Michal Valko. 22 Oct 2024. [OffRL]

Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando, Han Shen, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen. 20 Oct 2024. [CLL]
GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets
Oh Joon Kwon, Daiki E. Matsunaga, Kee-Eung Kim. 19 Oct 2024. [AI4CE]

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
Jiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Huazheng Wang, Kaixuan Huang, Yue Wu, Mengdi Wang. 18 Oct 2024.

Optimizing Preference Alignment with Differentiable NDCG Ranking
Jiacong Zhou, Xianyun Wang, Jun Yu. 17 Oct 2024.

Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization
Xingqi Wang, Xiaoyuan Yi, Xing Xie, Jia Jia. 16 Oct 2024.

Preference Optimization with Multi-Sample Comparisons
Chaoqi Wang, Zhuokai Zhao, Chen Zhu, Karthik Abinav Sankararaman, Michal Valko, ..., Zhaorun Chen, Madian Khabsa, Yuxin Chen, Hao Ma, Sinong Wang. 16 Oct 2024.
Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning
Jared Joselowitz, Arjun Jagota, Satyapriya Krishna, Sonali Parbhoo, Nyal Patel. 16 Oct 2024.

CREAM: Consistency Regularized Self-Rewarding Language Models
Z. Wang, Weilei He, Zhiyuan Liang, Xuchao Zhang, Chetan Bansal, Ying Wei, Weitong Zhang, Huaxiu Yao. 16 Oct 2024. [ALM]

Understanding Likelihood Over-optimisation in Direct Alignment Algorithms
Zhengyan Shi, Sander Land, Acyr F. Locatelli, Matthieu Geist, Max Bartolo. 15 Oct 2024.

Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao, Wenxuan Ding, Shangbin Feng, Lucy Lu Wang, Yulia Tsvetkov. 14 Oct 2024.
How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective
Teng Xiao, Mingxiao Li, Yige Yuan, Huaisheng Zhu, Chao Cui, V. Honavar. 14 Oct 2024. [ALM]

Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps
Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, H. Wang. 14 Oct 2024. [LRM]

Taming Overconfidence in LLMs: Reward Calibration in RLHF
Jixuan Leng, Chengsong Huang, Banghua Zhu, Jiaxin Huang. 13 Oct 2024.