2509.01055
Cited By
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use
1 September 2025
Dongfu Jiang
Yi Lu
Zhuofeng Li
Zhiheng Lyu
Ping Nie
Haozhe Wang
Alex Su
Hui Chen
Kai Zou
Chao Du
Tianyu Pang
Wenhu Chen
ArXiv (abs)
PDF
HTML
HuggingFace (61 upvotes)
Github (612★)
Papers citing "VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use"
18 / 18 papers shown
Environment Scaling for Interactive Agentic Experience Collection: A Survey
Y. Huang
S. Li
Minghao Liu
Wei Liu
Shijue Huang
Zhiyuan Fan
Hou Pong Chan
Yi R. Fung
148
0
0
24 Dec 2025
On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral
Wenlong Deng
Yushu Li
Boying Gong
Yi Ren
Christos Thrampoulidis
Xiaoxiao Li
37
2
0
03 Dec 2025
Agentic Policy Optimization via Instruction-Policy Co-Evolution
Han Zhou
Xingchen Wan
Ivan Vulić
Anna Korhonen
99
0
0
01 Dec 2025
SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent
Shiyi Cao
Dacheng Li
Fangzhou Zhao
Shuo Yuan
Sumanth R. Hegde
...
Richard Liaw
Philipp Moritz
Matei A. Zaharia
Joseph E. Gonzalez
Ion Stoica
122
2
0
20 Nov 2025
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Peng Xia
K. Zeng
Jiaqi Liu
Can Qin
Fang Wu
Yiyang Zhou
Caiming Xiong
Huaxiu Yao
LLMAG
LM&Ro
SyDa
717
3
0
20 Nov 2025
The Path Not Taken: RLVR Provably Learns Off the Principals
Hanqing Zhu
Zhenyu Zhang
Hanxian Huang
DiJia Su
Zechun Liu
...
Jinwon Lee
David Z. Pan
Zinan Lin
Yuandong Tian
Kai Sheng Tai
191
3
0
11 Nov 2025
Scaling Agent Learning via Experience Synthesis
Zhaorun Chen
Zhuokai Zhao
Kai Zhang
Bo Liu
Qi Qi
...
Xian Li
Kurt Thomas
Bo Li
Jason Weston
Dat Huynh
OffRL
CLL
456
1
0
05 Nov 2025
Sharpness-Controlled Group Relative Policy Optimization with Token-Level Probability Shaping
Tue Le
Nghi D.Q.Bui
Linh Ngo Van
Trung Le
136
0
0
29 Oct 2025
Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
Ran Xu
Jingjing Chen
Jiayu Ye
Yu Wu
Jun Yan
Carl Yang
Hongkun Yu
ELM
LRM
238
5
0
27 Oct 2025
DeepAgent: A General Reasoning Agent with Scalable Toolsets
Xiaoxi Li
Wenxiang Jiao
Jiarui Jin
Guanting Dong
Jiajie Jin
...
H. Wang
Yutao Zhu
Ji-Rong Wen
Yuan Lu
Zhicheng Dou
LLMAG
LRM
130
7
0
24 Oct 2025
A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications
Minhua Lin
Zongyu Wu
Zhichao Xu
Hui Liu
Xianfeng Tang
Qi He
Charu C. Aggarwal
Hui Liu
Xiang Zhang
Suhang Wang
AI4TS
LRM
555
1
0
19 Oct 2025
Agentic Entropy-Balanced Policy Optimization
Guanting Dong
Licheng Bao
Zhongyuan Wang
Kangzhi Zhao
Xiaoxi Li
...
Kun Gai
Guorui Zhou
Yutao Zhu
Ji-Rong Wen
Zhicheng Dou
91
2
0
16 Oct 2025
Reinforcement Learning for Tool-Integrated Interleaved Thinking towards Cross-Domain Generalization
Zhengyu Chen
Jinluan Yang
Teng Xiao
Ruochen Zhou
L. Zhang
Xiangyu Xi
X. Shi
Wei Wang
Jinggang Wang
LRM
116
1
0
13 Oct 2025
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions
Tao Yu
Zhengbo Zhang
Zhiheng Lyu
Junhao Gong
Hongzhu Yi
...
Yuxuan Zhou
J. Yang
Ping Nie
Yan Huang
Wenhu Chen
LLMAG
LM&Ro
194
1
0
12 Oct 2025
AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning
Zhanke Zhou
Chentao Cao
Xiao Feng
Xuan Li
Zongze Li
...
Brando Miranda
Tongliang Liu
Sanmi Koyejo
Masashi Sugiyama
Bo Han
ReLM
LRM
112
0
0
05 Oct 2025
QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL
Cong Yu
Valter Uotila
Shilong Deng
Qingyuan Wu
Tuo Shi
Songlin Jiang
Lei You
Bo Zhao
130
2
0
01 Oct 2025
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
Renjie Luo
Zichen Liu
Xiangyan Liu
Chao Du
Min Lin
Wenhu Chen
Wei Lu
Tianyu Pang
OffRL
153
3
0
26 Sep 2025
Variational Reasoning for Language Models
Xiangxin Zhou
Zichen Liu
Haonan Wang
Chao Du
Min Lin
Chongxuan Li
Liang Wang
Tianyu Pang
OffRL
LRM
202
0
0
26 Sep 2025