Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2412.17256
Cited By
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
23 December 2024
Weihao Zeng
Yuzhen Huang
Lulu Zhao
Yijun Wang
Zifei Shan
Junxian He
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners"
5 / 5 papers shown
Title
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model
Tianqing Fang
H. M. Zhang
Z. Zhang
Kaixin Ma
W. Yu
Haitao Mi
Dong Yu
LLMAG
KELM
159
0
0
23 Apr 2025
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Xiangyan Liu
Jinjie Ni
Zijian Wu
Chao Du
Longxu Dou
Haoran Wang
Tianyu Pang
Michael Shieh
OffRL
LRM
140
0
0
17 Apr 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
FangZhi Xu
Hang Yan
Chang Ma
Haiteng Zhao
Qiushi Sun
Kanzhi Cheng
Junxian He
Jun Liu
Zhiyong Wu
LRM
29
1
0
11 Apr 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
91
38
0
24 Mar 2025
SOLAR: Scalable Optimization of Large-scale Architecture for Reasoning
Chen Li
Yinyi Luo
Anudeep Bolimera
Uzair Ahmed
Shri Kiran Srinivasan
Hrishikesh Gokhale
Marios Savvides
LRM
AI4CE
60
1
0
06 Mar 2025
1