ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.11287
  4. Cited By
Process Reward Model with Q-Value Rankings

Process Reward Model with Q-Value Rankings

15 October 2024
W. Li
Yixuan Li
    LRM
ArXivPDFHTML

Papers citing "Process Reward Model with Q-Value Rankings"

12 / 12 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Shaokun Zhang
Yi Dong
Jieyu Zhang
Jan Kautz
Bryan Catanzaro
Andrew Tao
Qingyun Wu
Zhiding Yu
Guilin Liu
LLMAG
OffRL
KELM
LRM
83
0
0
25 Apr 2025
Efficient Process Reward Model Training via Active Learning
Efficient Process Reward Model Training via Active Learning
Keyu Duan
Zichen Liu
Xin Mao
Tianyu Pang
Changyu Chen
Qiguang Chen
Michael Shieh
Longxu Dou
LRM
12
1
0
14 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
27
1
0
12 Apr 2025
Rank-Then-Score: Enhancing Large Language Models for Automated Essay Scoring
Rank-Then-Score: Enhancing Large Language Models for Automated Essay Scoring
Yida Cai
Kun Liang
Sanwoo Lee
Qinghan Wang
Yunfang Wu
ALM
46
1
0
08 Apr 2025
VideoAgent2: Enhancing the LLM-Based Agent System for Long-Form Video Understanding by Uncertainty-Aware CoT
VideoAgent2: Enhancing the LLM-Based Agent System for Long-Form Video Understanding by Uncertainty-Aware CoT
Zhuo Zhi
Qiangqiang Wu
Minghe shen
W. J. Li
Yinchuan Li
Kun Shao
Kaiwen Zhou
LLMAG
23
0
0
06 Apr 2025
Process Reward Modeling with Entropy-Driven Uncertainty
Process Reward Modeling with Entropy-Driven Uncertainty
Lang Cao
Renhong Chen
Yingtian Zou
Chao Peng
Wu Ning
...
Y. Wang
Peishuo Su
Mofan Peng
Zijie Chen
Yitong Li
29
0
0
28 Mar 2025
Video-R1: Reinforcing Video Reasoning in MLLMs
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng
Kaixiong Gong
B. Li
Zonghao Guo
Yibing Wang
Tianshuo Peng
Benyou Wang
Xiangyu Yue
SyDa
AI4TS
LRM
35
13
0
27 Mar 2025
Process-Supervised LLM Recommenders via Flow-guided Tuning
Process-Supervised LLM Recommenders via Flow-guided Tuning
Chongming Gao
Mengyao Gao
Chenxiao Fan
Shuai Yuan
Wentao Shi
Xiangnan He
68
2
0
10 Mar 2025
FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4
Jiarui Yao
Ruida Wang
Tong Zhang
LRM
42
0
0
05 Mar 2025
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei
Haowei Liu
Xuyang Wu
Yi Fang
LRM
AI4CE
ReLM
KELM
91
1
0
21 Feb 2025
Coarse-to-Fine Process Reward Modeling for Mathematical Reasoning
Coarse-to-Fine Process Reward Modeling for Mathematical Reasoning
Y. Hu
Sheng Ouyang
Yong Liu
LRM
24
1
0
23 Jan 2025
1