ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.16635
  4. Cited By
Improving Reinforcement Learning from Human Feedback with Efficient
  Reward Model Ensemble

Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble

30 January 2024
Shun Zhang
Zhenfang Chen
Sunli Chen
Yikang Shen
Zhiqing Sun
Chuang Gan
ArXivPDFHTML

Papers citing "Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble"

28 / 28 papers shown
Title
Energy-Based Reward Models for Robust Language Model Alignment
Energy-Based Reward Models for Robust Language Model Alignment
Anamika Lochab
Ruqi Zhang
53
0
0
17 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
38
1
0
12 Apr 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
38
0
0
13 Mar 2025
Reward Shaping to Mitigate Reward Hacking in RLHF
Reward Shaping to Mitigate Reward Hacking in RLHF
Jiayi Fu
Xuandong Zhao
Chengyuan Yao
H. Wang
Qi Han
Yanghua Xiao
80
6
0
26 Feb 2025
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Zhenfang Chen
Delin Chen
Rui Sun
Wenjun Liu
Chuang Gan
LLMAG
58
3
0
17 Feb 2025
Ensembles of Low-Rank Expert Adapters
Ensembles of Low-Rank Expert Adapters
Yinghao Li
Vianne Gao
Chao Zhang
MohamadAli Torkamani
60
0
0
31 Jan 2025
CREAM: Consistency Regularized Self-Rewarding Language Models
CREAM: Consistency Regularized Self-Rewarding Language Models
Z. Wang
Weilei He
Zhiyuan Liang
Xuchao Zhang
Chetan Bansal
Ying Wei
Weitong Zhang
Huaxiu Yao
ALM
96
7
0
16 Oct 2024
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed
  Bandits
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits
Duy Nguyen
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
23
2
0
02 Oct 2024
Post-hoc Reward Calibration: A Case Study on Length Bias
Post-hoc Reward Calibration: A Case Study on Length Bias
Zeyu Huang
Zihan Qiu
Zili Wang
Edoardo M. Ponti
Ivan Titov
36
5
0
25 Sep 2024
Reward-Robust RLHF in LLMs
Reward-Robust RLHF in LLMs
Yuzi Yan
Xingzhou Lou
Jialian Li
Yiping Zhang
Jian Xie
Chao Yu
Yu Wang
Dong Yan
Yuan Shen
37
7
0
18 Sep 2024
Policy Filtration in RLHF to Fine-Tune LLM for Code Generation
Policy Filtration in RLHF to Fine-Tune LLM for Code Generation
Wei Shen
Chuheng Zhang
OffRL
30
6
0
11 Sep 2024
On the Limited Generalization Capability of the Implicit Reward Model
  Induced by Direct Preference Optimization
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization
Yong Lin
Skyler Seto
Maartje ter Hoeve
Katherine Metcalf
B. Theobald
Xuan Wang
Yizhe Zhang
Chen Huang
Tong Zhang
29
12
0
05 Sep 2024
Towards a Unified View of Preference Learning for Large Language Models:
  A Survey
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao
Feifan Song
Yibo Miao
Zefan Cai
Z. Yang
...
Houfeng Wang
Zhifang Sui
Peiyi Wang
Baobao Chang
Baobao Chang
41
11
0
04 Sep 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in
  the Era of Large Language Models
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
27
17
0
08 Jul 2024
A Survey on LoRA of Large Language Models
A Survey on LoRA of Large Language Models
Yuren Mao
Yuhang Ge
Yijiang Fan
Wenyi Xu
Yu Mi
Zhonghao Hu
Yunjun Gao
ALM
52
22
0
08 Jul 2024
Towards Comprehensive Preference Data Collection for Reward Modeling
Towards Comprehensive Preference Data Collection for Reward Modeling
Yulan Hu
Qingyang Li
Sheng Ouyang
Ge Chen
Kaihui Chen
Lijun Mei
Xucheng Ye
Fuzheng Zhang
Yong Liu
SyDa
32
4
0
24 Jun 2024
Effective Generative AI: The Human-Algorithm Centaur
Effective Generative AI: The Human-Algorithm Centaur
S. Saghafian
Lihi Idan
38
7
0
16 Jun 2024
Regularizing Hidden States Enables Learning Generalizable Reward Model
  for LLMs
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
Rui Yang
Ruomeng Ding
Yong Lin
Huan Zhang
Tong Zhang
21
42
0
14 Jun 2024
Scalable Ensembling For Mitigating Reward Overoptimisation
Scalable Ensembling For Mitigating Reward Overoptimisation
Ahmed M. Ahmed
Rafael Rafailov
Stepan Sharkov
Xuechen Li
Oluwasanmi Koyejo
24
5
0
03 Jun 2024
Offline Regularised Reinforcement Learning for Large Language Models
  Alignment
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
40
21
0
29 May 2024
ALaRM: Align Language Models via Hierarchical Rewards Modeling
ALaRM: Align Language Models via Hierarchical Rewards Modeling
Yuhang Lai
Siyuan Wang
Shujun Liu
Xuanjing Huang
Zhongyu Wei
16
4
0
11 Mar 2024
Bayesian Reward Models for LLM Alignment
Bayesian Reward Models for LLM Alignment
Adam X. Yang
Maxime Robeyns
Thomas Coste
Zhengyan Shi
Jun Wang
Haitham Bou-Ammar
Laurence Aitchison
32
17
0
20 Feb 2024
WARM: On the Benefits of Weight Averaged Reward Models
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
100
92
0
22 Jan 2024
Reinforcement Learning for Generative AI: A Survey
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
39
10
0
28 Aug 2023
Large Language Models are Zero-Shot Reasoners
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on
  Open Problems
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
321
1,944
0
04 May 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep
  Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
268
5,635
0
05 Dec 2016
1