Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
8 March 2024
Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang

Papers citing "Provable Multi-Party Reinforcement Learning with Diverse Human Feedback"

20 papers

Preference-Based Dynamic Ranking Structure Recognition (29 Sep 2025)
Nan Lu, Jian Shi, Xin-Yu Tian

Theoretical Tensions in RLHF: Reconciling Empirical Success with Inconsistencies in Social Choice Theory (14 Jun 2025)
Jiancong Xiao, Zhekun Shi, Kaizhao Liu, Q. Long, Weijie J. Su

Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework (05 Jun 2025)
Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, P. Parrilo

Doubly Robust Alignment for Large Language Models (01 Jun 2025)
Erhan Xu, Kai Ye, Hongyi Zhou, Luhan Zhu, Francesco Quinzan, Chengchun Shi

Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences? (29 May 2025)
Paul Gölz, Nika Haghtalab, Kunhe Yang

Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences? (19 May 2025)
Zilu Tang, Afra Feyza Akyürek, Ekin Akyürek, Derry Wijaya

Metric Distortion for Tournament Voting and Beyond (19 May 2025)
ACM Conference on Economics and Computation (EC), 2025
Moses Charikar, Prasanna Ramakrishnan, Zihan Tan, Kangning Wang

Learning Guarantee of Reward Modeling Using Deep Neural Networks (10 May 2025)
Yuanhang Luo, Yeheng Ge, Ruijian Han, Guohao Shen

Contextual Online Uncertainty-Aware Preference Learning for Human Feedback (27 Apr 2025)
Nan Lu, Ethan X. Fang, Junwei Lu

Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning (03 Apr 2025)
Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi

Capturing Individual Human Preferences with Reward Features (21 Mar 2025)
André Barreto, Vincent Dumoulin, Yiran Mao, Nicolas Perez-Nieves, Bobak Shahriari, Yann Dauphin, Doina Precup, Hugo Larochelle

Strategyproof Reinforcement Learning from Human Feedback (12 Mar 2025)
Thomas Kleine Buening, Jiarui Gan, Debmalya Mandal, Marta Z. Kwiatkowska

MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment (25 Feb 2025)
Tianze Wang, Dongnan Gui, Yifan Hu, Shuhang Lin, Linjun Zhang

Clone-Robust AI Alignment (17 Jan 2025)
Ariel D. Procaccia, Benjamin G. Schiffer, Shirley Zhang

Policy Aggregation (06 Nov 2024)
Neural Information Processing Systems (NeurIPS), 2024
Parand A. Alamdari, Soroush Ebadian, Ariel D. Procaccia

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization (26 May 2024)
Jiancong Xiao, Ziniu Li, Xingyu Xie, E. Getzen, Cong Fang, Qi Long, Weijie J. Su

Axioms for AI Alignment from Human Feedback (23 May 2024)
Neural Information Processing Systems (NeurIPS), 2024
Luise Ge, Daniel Halpern, Evi Micha, Ariel D. Procaccia, Itai Shapira, Yevgeniy Vorobeychik, Junlin Wu

Direct Preference Optimization With Unobserved Preference Heterogeneity: The Necessity of Ternary Preferences (23 May 2024)
Keertana Chidambaram, Karthik Vinay Seetharaman, Vasilis Syrgkanis

Fairness in Reinforcement Learning: A Survey (11 May 2024)
Anka Reuel, Devin Ma

RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation (30 Apr 2024)
Chanwoo Park, Mingyang Liu, Dingwen Kong, Kaiqing Zhang, Asuman Ozdaglar