Evaluating Robustness of Reward Models for Mathematical Reasoning
arXiv:2410.01729

2 October 2024
Sunghwan Kim, Dongjin Kang, Taeyoon Kwon, Hyungjoo Chae, Jungsoo Won, Dongha Lee, Jinyoung Yeo

Papers citing "Evaluating Robustness of Reward Models for Mathematical Reasoning"

9 papers

EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences
Kshitish Ghate, Andy Liu, Devansh Jain, Taylor Sorensen, Atoosa Kasirzadeh, Aylin Caliskan, Mona Diab, Maarten Sap
07 Oct 2025

Why is Your Language Model a Poor Implicit Reward Model?
Noam Razin, Yong Lin, Jiarui Yao, Sanjeev Arora
10 Jul 2025

VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
Yuchen Yan, Jin Jiang, Zhenbang Ren, Yijun Li, Xudong Cai, ..., Mengdi Zhang, Jian Shao, Yongliang Shen, Jun Xiao, Yueting Zhuang
21 May 2025

On the Robustness of Reward Models for Language Model Alignment
Jiwoo Hong, Noah Lee, Eunki Kim, Guijin Son, Woojin Chung, Aman Gupta, Shao Tang, Hyunjung Shim
12 May 2025

A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei, Haowei Liu, Xuyang Wu, Yi Fang
21 Feb 2025

Uncovering Factor Level Preferences to Improve Human-Model Alignment (EMNLP 2024)
Juhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, Inha Cha, William Yang Wang, Alice Oh
09 Oct 2024

Evaluating Mathematical Reasoning Beyond Accuracy
Shijie Xia, Xuefeng Li, Yixin Liu, Tongshuang Wu, Pengfei Liu
08 Apr 2024

Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
18 Jan 2024

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct (ICLR, 2023)
Haipeng Luo, Qingfeng Sun, Can Xu, Lu Wang, Jian-Guang Lou, ..., Xiubo Geng, Qingwei Lin, Shifeng Chen, Yansong Tang, Dongmei Zhang
18 Aug 2023