A General Theoretical Paradigm to Understand Learning from Human Preferences

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
18 October 2023
M. G. Azar
Mark Rowland
Bilal Piot
Daniel Guo
Daniele Calandriello
Michal Valko
Rémi Munos

Papers citing "A General Theoretical Paradigm to Understand Learning from Human Preferences"

50 / 579 papers shown
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
466
32
0
18 Mar 2025
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
Songjun Tu
Jiahao Lin
Xiangyu Tian
Qichao Zhang
Linjing Li
...
Nan Xu
Wei He
Xiangyuan Lan
Shihong Deng
Dongbin Zhao
LRM
501
13
0
17 Mar 2025
Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Qiyuan Deng
X. Bai
Kehai Chen
Yaowei Wang
Liqiang Nie
Min Zhang
OffRL
249
2
0
13 Mar 2025
Robust Multi-Objective Controlled Decoding of Large Language Models
Seongho Son
William Bankes
Sangwoong Yoon
Shyam Sundhar Ramesh
Xiaohang Tang
Ilija Bogunovic
328
5
0
11 Mar 2025
Preference-Based Alignment of Discrete Diffusion Models
Umberto Borso
Davide Paglieri
Jude Wells
Tim Rocktaschel
246
6
0
11 Mar 2025
Mitigating Preference Hacking in Policy Optimization with Pessimism
Dhawal Gupta
Adam Fisch
Christoph Dann
Alekh Agarwal
261
2
0
10 Mar 2025
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs
Jongwoo Ko
Tianyi Chen
Sungnyun Kim
Tianyu Ding
Luming Liang
Ilya Zharkov
Se-Young Yun
VLM
972
17
0
10 Mar 2025
RePO: Understanding Preference Learning Through ReLU-Based Optimization
Junkang Wu
Kexin Huang
Qingsong Wen
Jinyang Gao
Bolin Ding
Jiancan Wu
Xiangnan He
Xiang Wang
265
3
0
10 Mar 2025
ACAI for SBOs: AI Co-creation for Advertising and Inspiration for Small Business Owners
Nimisha Karnatak
Adrien Baranes
Rob Marchant
Triona Butler
Kristen Olson
236
0
0
09 Mar 2025
Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity
HyunJin Kim
Xiaoyuan Yi
Jing Yao
Muhua Huang
Jinyeong Bak
James Evans
Xing Xie
283
0
0
08 Mar 2025
Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
369
5
0
06 Mar 2025
Adding Alignment Control to Language Models
Wenhong Zhu
Weinan Zhang
Rui Wang
304
0
0
06 Mar 2025
FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4
Jiarui Yao
Ruida Wang
Tong Zhang
LRM
304
2
0
05 Mar 2025
Preserving Cultural Identity with Context-Aware Translation Through Multi-Agent AI Systems
Mahfuz Ahmed Anik
Abdur Rahman
Azmine Toushik Wasi
Md Manjurul Ahsan
339
13
0
05 Mar 2025
Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Alessio Galatolo
Zhenbang Dai
Katie Winkle
Meriem Beloucif
252
0
0
05 Mar 2025
Adversarial Tokenization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Renato Lui Geh
Zilei Shao
Karen Ullrich
SILM, AAML
398
6
0
04 Mar 2025
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Songming Zhang
Xue Zhang
Tong Zhang
Bojie Hu
Yufeng Chen
Jinan Xu
387
4
0
04 Mar 2025
All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
Gokul Swamy
Sanjiban Choudhury
Wen Sun
Zhiwei Steven Wu
J. Andrew Bagnell
OffRL
415
43
0
03 Mar 2025
Diffusion Classifier-Driven Reward for Offline Preference-based Reinforcement Learning
Teng Pang
Bingzheng Wang
Guoqiang Wu
Yilong Yin
OffRL
497
0
0
03 Mar 2025
Robust Multi-Objective Preference Alignment with Online DPO
AAAI Conference on Artificial Intelligence (AAAI), 2025
Raghav Gupta
Ryan Sullivan
Yunxuan Li
Samrat Phatale
Abhinav Rastogi
195
9
0
01 Mar 2025
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
Jiaxin Deng
Shiyao Wang
Kuo Cai
Lejian Ren
Qigen Hu
Weifeng Ding
Qiang Luo
Guorui Zhou
283
83
0
26 Feb 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
529
4
0
26 Feb 2025
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users
Anikait Singh
Sheryl Hsu
Kyle Hsu
E. Mitchell
Stefano Ermon
Tatsunori Hashimoto
Archit Sharma
Chelsea Finn
SyDa, OffRL
290
15
0
26 Feb 2025
Self-rewarding correction for mathematical reasoning
Wei Xiong
Hanning Zhang
Chenlu Ye
Lichang Chen
Nan Jiang
Tong Zhang
ReLM, KELM, LRM
404
39
0
26 Feb 2025
CuDIP: Enhancing Theorem Proving in LLMs via Curriculum Learning-based Direct Preference Optimization
Shuming Shi
Ruobing Zuo
Gaolei He
Jianlin Wang
Chenyang Xu
Zhengfeng Yang
324
0
0
25 Feb 2025
AMPO: Active Multi-Preference Optimization for Self-play Preference Selection
Taneesh Gupta
Rahul Madhavan
Xuchao Zhang
Chetan Bansal
Saravan Rajmohan
314
0
0
25 Feb 2025
Larger or Smaller Reward Margins to Select Preferences for Alignment?
Kexin Huang
Junkang Wu
Ziqian Chen
Qingsong Wen
Jinyang Gao
Bolin Ding
Jiancan Wu
Xiangnan He
Xiang Wang
190
2
0
25 Feb 2025
Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data
Siqi Guo
Ilgee Hong
Vicente Balmaseda
Changlong Yu
Liang Qiu
Xin Liu
Haoming Jiang
Tuo Zhao
Tianbao Yang
315
0
0
25 Feb 2025
Stackelberg Game Preference Optimization for Data-Efficient Alignment of Language Models
Xu Chu
Zhixin Zhang
Tianyu Jia
Yujie Jin
415
2
0
25 Feb 2025
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Yuheng Zhang
Dian Yu
Tao Ge
Linfeng Song
Zhichen Zeng
Haitao Mi
Nan Jiang
Dong Yu
288
10
0
24 Feb 2025
Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance
Chenghua Huang
Lu Wang
Fangkai Yang
Pu Zhao
Hao Sun
Qingwei Lin
Dongmei Zhang
Saravan Rajmohan
Qi Zhang
OffRL
215
3
0
24 Feb 2025
Post-edits Are Preferences Too
Conference on Machine Translation (WMT), 2024
Nathaniel Berger
Stefan Riezler
M. Exel
Matthias Huck
332
2
0
24 Feb 2025
Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding
Tian Jin
Ellie Y. Cheng
Zack Ankner
Nikunj Saunshi
Blake M. Elias
Amir Yazdanbakhsh
Jonathan Ragan-Kelley
Suvinay Subramanian
Michael Carbin
319
18
0
24 Feb 2025
C2-DPO: Constrained Controlled Direct Preference Optimization
Kavosh Asadi
Julien Han
Idan Pipano
Xingzi Xu
Dominique Perrault-Joncas
Shoham Sabach
Karim Bouyarmane
Mohammad Ghavamzadeh
293
0
0
22 Feb 2025
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
International Conference on Learning Representations (ICLR), 2025
Teng Xiao
Yige Yuan
Ziyang Chen
Mingxiao Li
Shangsong Liang
Zhaochun Ren
V. Honavar
608
21
0
21 Feb 2025
Simplify RLHF as Reward-Weighted SFT: A Variational Method
Yuhao Du
Hui Yuan
Pengyu Cheng
Zhihong Chen
Yuejiao Xie
Xiang Wan
Anningzhe Gao
325
7
0
20 Feb 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
International Conference on Learning Representations (ICLR), 2024
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
581
57
0
20 Feb 2025
Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Tong Yang
Jincheng Mei
H. Dai
Zixin Wen
Shicong Cen
Dale Schuurmans
Yuejie Chi
Bo Dai
342
6
0
20 Feb 2025
Bursting Filter Bubble: Enhancing Serendipity Recommendations with Aligned Large Language Models
Yunjia Xi
Muyan Weng
Wen Chen
Chao Yi
Benlin Liu
...
Jian Wu
Yuning Jiang
Qingwen Liu
Yong Yu
Weinan Zhang
LRM
156
5
0
20 Feb 2025
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing
Peiran Li
Ruizheng Bai
Longji Xu
Chan-wei Hu
Chengxuan Qian
Huaxiu Yao
Zhengzhong Tu
449
20
0
18 Feb 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Longxu Dou
Qian Liu
Fan Zhou
Changyu Chen
Zili Wang
...
Tianyu Pang
Chao Du
Xinyi Wan
Wei Lu
Jialin Li
424
8
0
18 Feb 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
Volkan Cevher
484
2
0
18 Feb 2025
Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training
Fenghua Weng
Jian Lou
Jun Feng
Shiyu Huang
Wenjie Wang
AAML
335
6
0
17 Feb 2025
Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models
Yingshui Tan
Yilei Jiang
Yongbin Li
Qingbin Liu
Xingyuan Bu
Yuchi Xu
Xiangyu Yue
Xiaoyong Zhu
Bo Zheng
ALM
355
14
0
17 Feb 2025
Preference learning made easy: Everything should be understood through win rate
Lily H. Zhang
Rajesh Ranganath
311
4
0
14 Feb 2025
PIPA: Preference Alignment as Prior-Informed Statistical Estimation
Junbo Li
Zinan Lin
Qiang Liu
OffRL
397
0
0
09 Feb 2025
Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization
Yongcheng Zeng
Xinyu Cui
Xuanfa Jin
Guoqing Liu
...
Ning Yang
Jun Wang
Jianye Hao
Haifeng Zhang
Jun Wang
LLMAG, LRM
366
1
0
08 Feb 2025
Design Considerations in Offline Preference-based RL
Alekh Agarwal
Christoph Dann
T. V. Marinov
OffRL
303
1
0
08 Feb 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain
Paarth Neekhara
Xuesong Yang
Edresson Casanova
Subhankar Ghosh
Mikyas T. Desta
Roy Fejgin
Rafael Valle
Jason Chun Lok Li
359
19
0
07 Feb 2025
Extracting and Understanding the Superficial Knowledge in Alignment
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Runjin Chen
Gabriel Jacob Perin
Xuxi Chen
Xilun Chen
Y. Han
Nina S. T. Hirata
Junyuan Hong
B. Kailkhura
243
5
0
07 Feb 2025