RMB: Comprehensively Benchmarking Reward Models in LLM Alignment


13 October 2024
Enyu Zhou
Guodong Zheng
B. Wang
Zhiheng Xi
Shihan Dou
Rong Bao
Wei Shen
Limao Xiong
Jessica Fan
Yurong Mou
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
    ALM

Papers citing "RMB: Comprehensively Benchmarking Reward Models in LLM Alignment"

11 / 11 papers shown
RM-R1: Reward Modeling as Reasoning
X. Chen
Gaotang Li
Z. Wang
Bowen Jin
Cheng Qian
...
Y. Zhang
D. Zhang
Tong Zhang
Hanghang Tong
Heng Ji
ReLM
OffRL
LRM
35
0
0
05 May 2025
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou
Austin Xu
Peifeng Wang
Caiming Xiong
Shafiq R. Joty
ELM
ALM
LRM
38
1
0
21 Apr 2025
Energy-Based Reward Models for Robust Language Model Alignment
Anamika Lochab
Ruqi Zhang
36
0
0
17 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
29
1
0
12 Apr 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao
Xuqi Liu
Zhongqi Yue
Y. Wu
Shuang Chen
Juncheng Billy Li
Siliang Tang
Fei Wu
Tat-Seng Chua
Yueting Zhuang
OffRL
LRM
31
1
0
09 Apr 2025
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu
P. Wang
R. Xu
Shirong Ma
Chong Ruan
Peng Li
Yang Janet Liu
Y. Wu
OffRL
LRM
44
9
0
03 Apr 2025
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Weiyun Wang
Zhangwei Gao
L. Chen
Zhe Chen
Jinguo Zhu
...
Lewei Lu
Haodong Duan
Yu Qiao
Jifeng Dai
Wenhai Wang
LRM
48
9
0
13 Mar 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
...
Haodong Duan
W. Zhang
Kai Chen
D. Lin
Jiaqi Wang
VLM
65
17
0
21 Jan 2025
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
Zhuoran Jin
Hongbang Yuan
Tianyi Men
Pengfei Cao
Yubo Chen
Kang-Jun Liu
Jun Zhao
ALM
79
7
0
18 Dec 2024
VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
Lei Li
Y. X. Wei
Zhihui Xie
Xuqing Yang
Yifan Song
...
Tianyu Liu
Sujian Li
Bill Yuchen Lin
Lingpeng Kong
Q. Liu
CoGe
VLM
107
19
0
26 Nov 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja
Lester James Validad Miranda
Shayekh Bin Islam
Rishabh Maheshwary
Drishti Sharma
Gusti Winata
Nathan Lambert
Sebastian Ruder
Sara Hooker
Marzieh Fadaee
LRM
32
12
0
20 Oct 2024