2503.04548
Cited By
An Empirical Study on Eliciting and Improving R1-like Reasoning Models
6 March 2025
Z. Chen
Yingqian Min
Beichen Zhang
Jie Chen
Jinhao Jiang
Daixuan Cheng
Wayne Xin Zhao
Zheng Liu
Xu Miao
Y. Lu
Lei Fang
Zhongyuan Wang
Ji-Rong Wen
ReLM
OffRL
LRM
Papers citing
"An Empirical Study on Eliciting and Improving R1-like Reasoning Models"
12 / 12 papers shown
Title
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
X. Li
Jiajie Jin
Guanting Dong
Hongjin Qian
Yutao Zhu
Yongkang Wu
Ji-Rong Wen
Zhicheng Dou
LLMAG
LRM
82
1
0
30 Apr 2025
SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM
X. Zhang
J. Wang
Zifei Cheng
Wenhao Zhuang
Zheng Lin
...
Shouyu Yin
Chaohang Wen
Haotian Zhang
Bin Chen
Bing Yu
LRM
33
2
0
19 Apr 2025
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
Yixuan Even Xu
Yash Savani
Fei Fang
Zico Kolter
OffRL
24
1
0
18 Apr 2025
Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT?
Yiyou Sun
Georgia Zhou
H. Wang
D. Li
Nouha Dziri
Dawn Song
ReLM
ALM
ELM
LRM
72
0
1
16 Apr 2025
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Jiazhan Feng
Shijue Huang
Xingwei Qu
Ge Zhang
Yujia Qin
Baoquan Zhong
Chengquan Jiang
Jinxin Chi
Wanjun Zhong
OffRL
ReLM
SyDa
KELM
LRM
54
4
0
15 Apr 2025
Slow Thinking for Sequential Recommendation
Junjie Zhang
Beichen Zhang
Wenqi Sun
Hongyu Lu
Wayne Xin Zhao
Yu Chen
Ji-Rong Wen
OffRL
LRM
28
0
0
13 Apr 2025
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Hardy Chen
Haoqin Tu
Fali Wang
Hui Liu
X. Tang
Xinya Du
Yuyin Zhou
Cihang Xie
ReLM
VLM
OffRL
LRM
60
6
0
10 Apr 2025
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi
Yiyang Wu
Linxin Song
Tianyi Zhou
Jieyu Zhao
LRM
76
1
0
07 Apr 2025
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
Haoxiang Sun
Yingqian Min
Z. Chen
Wayne Xin Zhao
Zheng Liu
Z. Wang
Lei Fang
Ji-Rong Wen
ELM
LRM
42
1
0
27 Mar 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
91
28
0
24 Mar 2025
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Z. Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya-Qin Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRL
LRM
64
41
0
18 Mar 2025
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Huatong Song
Jinhao Jiang
Yingqian Min
Jie Chen
Z. Chen
Wayne Xin Zhao
Lei Fang
Ji-Rong Wen
AI4TS
LRM
KELM
93
6
0
07 Mar 2025