
A General Theoretical Paradigm to Understand Learning from Human Preferences

18 October 2023
M. G. Azar
Mark Rowland
Bilal Piot
Daniel Guo
Daniele Calandriello
Michal Valko
Rémi Munos

Papers citing "A General Theoretical Paradigm to Understand Learning from Human Preferences"

50 / 415 papers shown
WPO: Enhancing RLHF with Weighted Preference Optimization
Wenxuan Zhou
Ravi Agrawal
Shujian Zhang
Sathish Indurthi
Sanqiang Zhao
Kaiqiang Song
Silei Xu
Chenguang Zhu
35
17
0
17 Jun 2024
Measuring memorization in RLHF for code completion
Aneesh Pappu
Billy Porter
Ilia Shumailov
Jamie Hayes
31
0
0
17 Jun 2024
Nemotron-4 340B Technical Report
Nvidia:
Bo Adler
Niket Agarwal
Ashwath Aithal
...
Jimmy Zhang
Jing Zhang
Vivienne Zhang
Yian Zhang
Chen Zhu
36
55
0
17 Jun 2024
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
41
8
0
17 Jun 2024
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang
Shiqi Shen
Guangyao Shen
Zhi Gong
Yankai Lin
Ji-Rong Wen
50
13
0
17 Jun 2024
Toward Optimal LLM Alignments Using Two-Player Games
Rui Zheng
Hongyi Guo
Zhihan Liu
Xiaoying Zhang
Yuanshun Yao
...
Tao Gui
Qi Zhang
Xuanjing Huang
Hang Li
Yang Liu
58
5
0
16 Jun 2024
Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
Junru Lu
Jiazheng Li
Siyu An
Meng Zhao
Yulan He
Di Yin
Xing Sun
39
13
0
16 Jun 2024
Self-Evolution Fine-Tuning for Policy Optimization
Ruijun Chen
Jiehao Liang
Shiping Gao
Fanqi Wan
Xiaojun Quan
35
0
0
16 Jun 2024
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning
Jifan Zhang
Lalit P. Jain
Yang Guo
Jiayi Chen
Kuan Lok Zhou
...
Scott Sievert
Timothy Rogers
Kevin Jamieson
Robert Mankoff
Robert Nowak
31
5
0
15 Jun 2024
Knowledge Editing in Language Models via Adapted Direct Preference Optimization
Amit Rozner
Barak Battash
Lior Wolf
Ofir Lindenbaum
KELM
52
8
0
14 Jun 2024
Bootstrapping Language Models with DPO Implicit Rewards
Changyu Chen
Zichen Liu
Chao Du
Tianyu Pang
Qian Liu
Arunesh Sinha
Pradeep Varakantham
Min-Bin Lin
SyDa
ALM
62
23
0
14 Jun 2024
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Miaosen Zhang
Yixuan Wei
Zhen Xing
Yifei Ma
Zuxuan Wu
...
Zheng-Wei Zhang
Qi Dai
Chong Luo
Xin Geng
Baining Guo
VLM
46
1
0
13 Jun 2024
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Hamish Ivison
Yizhong Wang
Jiacheng Liu
Zeqiu Wu
Valentina Pyatkin
Nathan Lambert
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
44
38
0
13 Jun 2024
On Softmax Direct Preference Optimization for Recommendation
Yuxin Chen
Junfei Tan
An Zhang
Zhengyi Yang
Leheng Sheng
Enzhi Zhang
Xiang Wang
Tat-Seng Chua
29
24
0
13 Jun 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
Xuan Zhang
Chao Du
Tianyu Pang
Qian Liu
Wei Gao
Min-Bin Lin
LRM
AI4CE
44
34
0
13 Jun 2024
ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions
Xu Zhang
Xunjian Yin
Xiaojun Wan
40
3
0
13 Jun 2024
Mistral-C2F: Coarse to Fine Actor for Analytical and Reasoning Enhancement in RLHF and Effective-Merged LLMs
Chen Zheng
Ke Sun
Xun Zhou
MoE
47
0
0
12 Jun 2024
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences
Daiwei Chen
Yi Chen
Aniket Rege
Ramya Korlakai Vinayak
38
17
0
12 Jun 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Yuntian Deng
Radha Poovendran
Yejin Choi
Bill Yuchen Lin
SyDa
32
111
0
12 Jun 2024
Discovering Preference Optimization Algorithms with and for Large Language Models
Chris Xiaoxuan Lu
Samuel Holt
Claudio Fanconi
Alex J. Chan
Jakob Foerster
M. Schaar
R. T. Lange
OffRL
32
14
0
12 Jun 2024
OPTune: Efficient Online Preference Tuning
Lichang Chen
Jiuhai Chen
Chenxi Liu
John Kirchenbauer
Davit Soselia
Chen Zhu
Tom Goldstein
Tianyi Zhou
Heng Huang
34
4
0
11 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
50
1
0
11 Jun 2024
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
Yuzi Yan
Yibo Miao
J. Li
Yipin Zhang
Jian Xie
Zhijie Deng
Dong Yan
49
11
0
11 Jun 2024
Distributional Preference Alignment of LLMs via Optimal Transport
Igor Melnyk
Youssef Mroueh
Brian M. Belgodere
Mattia Rigotti
Apoorva Nitsure
Mikhail Yurochkin
Kristjan Greenewald
Jirí Navrátil
Jerret Ross
42
11
0
09 Jun 2024
UltraMedical: Building Specialized Generalists in Biomedicine
Kaiyan Zhang
Sihang Zeng
Ermo Hua
Ning Ding
Zhang-Ren Chen
...
Xuekai Zhu
Xingtai Lv
Hu Jinfang
Zhiyuan Liu
Bowen Zhou
LM&MA
39
20
0
06 Jun 2024
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov
Yaswanth Chittepu
Ryan Park
Harshit S. Sikchi
Joey Hejna
Bradley Knox
Chelsea Finn
S. Niekum
50
48
0
05 Jun 2024
Adaptive Preference Scaling for Reinforcement Learning with Human Feedback
Ilgee Hong
Zichong Li
Alexander Bukharin
Yixiao Li
Haoming Jiang
Tianbao Yang
Tuo Zhao
29
4
0
04 Jun 2024
Dishonesty in Helpful and Harmless Alignment
Youcheng Huang
Jingkun Tang
Duanyu Feng
Zheng-Wei Zhang
Wenqiang Lei
Jiancheng Lv
Anthony G. Cohn
LLMSV
38
3
0
04 Jun 2024
Self-Improving Robust Preference Optimization
Eugene Choi
Arash Ahmadian
Matthieu Geist
Olivier Pietquin
M. G. Azar
28
8
0
03 Jun 2024
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
Lin Gui
Cristina Garbacea
Victor Veitch
BDL
LM&MA
41
36
0
02 Jun 2024
Aligning Language Models with Demonstrated Feedback
Omar Shaikh
Michelle S. Lam
Joey Hejna
Yijia Shao
Michael S. Bernstein
Diyi Yang
ALM
31
23
0
02 Jun 2024
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
Maximillian Chen
Ruoxi Sun
Sercan Ö. Arik
Tomas Pfister
LLMAG
29
6
0
31 May 2024
Direct Alignment of Language Models via Quality-Aware Self-Refinement
Runsheng Yu
Yong Wang
Xiaoqi Jiao
Youzhi Zhang
James T. Kwok
48
7
0
31 May 2024
Transfer Q Star: Principled Decoding for LLM Alignment
Souradip Chakraborty
Soumya Suvra Ghosal
Ming Yin
Dinesh Manocha
Mengdi Wang
Amrit Singh Bedi
Furong Huang
44
24
0
30 May 2024
Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh
Yifan Hu
Iason Chaimalas
Viraj Mehta
Pier Giuseppe Sessa
Haitham Bou-Ammar
Ilija Bogunovic
19
23
0
30 May 2024
InstructionCP: A fast approach to transfer Large Language Models into target language
Kuang-Ming Chen
Hung-yi Lee
CLL
41
2
0
30 May 2024
Preference Alignment with Flow Matching
Minu Kim
Yongsik Lee
Sehyeok Kang
Jihwan Oh
Song Chong
Seyoung Yun
32
1
0
30 May 2024
Enhancing Large Vision Language Models with Self-Training on Image Comprehension
Yihe Deng
Pan Lu
Fan Yin
Ziniu Hu
Sheng Shen
James Y. Zou
Kai-Wei Chang
Wei Wang
SyDa
VLM
LRM
36
36
0
30 May 2024
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
67
12
0
30 May 2024
One-Shot Safety Alignment for Large Language Models via Optimal Dualization
Xinmeng Huang
Shuo Li
Edgar Dobriban
Osbert Bastani
Hamed Hassani
Dongsheng Ding
41
3
0
29 May 2024
Preference Learning Algorithms Do Not Learn Preference Rankings
Angelica Chen
Sadhika Malladi
Lily H. Zhang
Xinyi Chen
Qiuyi Zhang
Rajesh Ranganath
Kyunghyun Cho
25
23
0
29 May 2024
AI Risk Management Should Incorporate Both Safety and Security
Xiangyu Qi
Yangsibo Huang
Yi Zeng
Edoardo Debenedetti
Jonas Geiping
...
Chaowei Xiao
Bo-wen Li
Dawn Song
Peter Henderson
Prateek Mittal
AAML
43
10
0
29 May 2024
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Shenao Zhang
Donghan Yu
Hiteshi Sharma
Ziyi Yang
Shuohang Wang
Hany Hassan
Zhaoran Wang
LRM
40
28
0
29 May 2024
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
Zhanhui Zhou
Zhixuan Liu
Jie Liu
Zhichen Dong
Chao Yang
Yu Qiao
ALM
36
20
0
29 May 2024
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
40
21
0
29 May 2024
Robust Preference Optimization through Reward Model Distillation
Adam Fisch
Jacob Eisenstein
Vicky Zayats
Alekh Agarwal
Ahmad Beirami
Chirag Nagpal
Peter Shaw
Jonathan Berant
73
21
0
29 May 2024
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu
Bowen Yu
Fei Huang
Yang Fan
Runji Lin
Chang Zhou
MoMe
24
18
0
28 May 2024
Prompt Optimization with Human Feedback
Xiaoqiang Lin
Zhongxiang Dai
Arun Verma
See-Kiong Ng
P. Jaillet
K. H. Low
AAML
34
8
0
27 May 2024
Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization
Amir Saeidi
Shivanshu Verma
Aswin Rrv
Chitta Baral
32
0
0
26 May 2024
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Jiancong Xiao
Ziniu Li
Xingyu Xie
E. Getzen
Cong Fang
Qi Long
Weijie J. Su
41
12
0
26 May 2024