2505.16421
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
22 May 2025
Zhepei Wei
Wenlin Yao
Yao Liu
Weizhi Zhang
Qin Lu
Liang Qiu
Changlong Yu
Puyang Xu
Chao Zhang
Bing Yin
Hyokun Yun
Lihong Li
OffRL
CLL
OnRL
LRM
Papers citing
"WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning"
49 / 49 papers shown
GTM: Simulating the World of Tools for AI Agents
Zhenzhen Ren
Xinpeng Zhang
Zhenxing Qian
Yan Gao
Yu Shi
Shuxin Zheng
Jiyan He
LLMAG
263
3
0
04 Dec 2025
Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation
Zehao Deng
Tianjie Ju
Zheng Wu
Zhuosheng Zhang
Gongshen Liu
OffRL
120
0
0
27 Nov 2025
OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability
Karen Ullrich
Jingtong Su
Claudia Shi
Arjun Subramonian
Amir Bar
Ivan Evtimov
Nikolaos Tsilivis
Randall Balestriero
Julia Kempe
Mark Ibrahim
213
1
0
25 Nov 2025
Stabilizing Off-Policy Training for Long-Horizon LLM Agent via Turn-Level Importance Sampling and Clipping-Triggered Normalization
Chenliang Li
Adel Elmahdy
Alex Boyd
Zhongruo Wang
Alfredo García
Parminder Bhatia
Taha A. Kass-Hout
Cao Xiao
Mingyi Hong
OffRL
268
1
0
25 Nov 2025
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance
Genglin Liu
Shijie Geng
Sha Li
Hejie Cui
Sarah Zhang
Xin Liu
Tianyi Liu
CLL
789
5
0
17 Nov 2025
SynthAgent: Adapting Web Agents with Synthetic Supervision
Zhaoyang Wang
Yiming Liang
Xuchao Zhang
Qianhui Wu
Siwei Han
...
Chetan Bansal
Baolin Peng
J. Gao
Saravan Rajmohan
Huaxiu Yao
184
4
0
08 Nov 2025
Scaling Agent Learning via Experience Synthesis
Zhaorun Chen
Zhuokai Zhao
Kai Zhang
Bo Liu
Qi Qi
...
Xian Li
Kurt Thomas
Bo Li
Jason Weston
Dat Huynh
OffRL
CLL
568
10
0
05 Nov 2025
Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation
Zhiwei Zhang
Xiaomin Li
Yudi Lin
Hui Liu
Ramraj Chandradevan
...
Minhua Lin
Fali Wang
Xianfeng Tang
Qi He
Suhang Wang
LLMAG
LRM
298
6
0
04 Nov 2025
Optimizing Retrieval for RAG via Reinforcement Learning
Jiawei Zhou
Lei Chen
205
1
0
28 Oct 2025
Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
Yaochen Zhu
Harald Steck
Dawen Liang
Yinhan He
Vito Ostuni
Jundong Li
Nathan Kallus
343
4
0
23 Oct 2025
Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains
Austin Xu
Xuan-Phi Nguyen
Yilun Zhou
Chien-Sheng Wu
Caiming Xiong
Shafiq Joty
OffRL
ALM
LRM
ELM
278
3
0
20 Oct 2025
A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications
Minhua Lin
Zongyu Wu
Zhichao Xu
Hui Liu
Xianfeng Tang
Qi He
Charu C. Aggarwal
Hui Liu
Xiang Zhang
Suhang Wang
AI4TS
LRM
636
9
0
19 Oct 2025
WEBSERV: A Browser-Server Environment for Efficient Training of Reinforcement Learning-based Web Agents at Scale
Yuxuan Lu
Jing Huang
Hui Liu
Jiri Gesi
Yan Han
Shihan Fu
Tianqi Zheng
Dakuo Wang
OffRL
141
3
0
17 Oct 2025
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents
Rui Wang
Ce Zhang
Jun-Yu Ma
Jianshu Zhang
Hongru Wang
...
Z. Zhang
Hongming Zhang
Haitao Mi
Dong Yu
Kam-Fai Wong
168
2
0
16 Oct 2025
Towards Agentic Self-Learning LLMs in Search Environment
Wangtao Sun
Xiang Cheng
Jialin Fan
Yao Xu
Xing Yu
Shizhu He
Jun Zhao
Kang Liu
211
4
0
16 Oct 2025
DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping
Wei Fan
Wenlin Yao
Zheng Li
Feng Yao
Xin Liu
Liang Qiu
Qingyu Yin
Yangqiu Song
Bing Yin
LLMAG
OffRL
175
1
0
14 Oct 2025
A Survey on Agentic Multimodal Large Language Models
Huanjin Yao
Ruifei Zhang
Jiaxing Huang
Jingyi Zhang
Yibo Wang
...
Ruolin Zhu
Yongcheng Jing
Shunyu Liu
Guanbin Li
Dacheng Tao
LM&Ro
AIFin
AI4TS
LRM
AI4CE
306
12
0
13 Oct 2025
Deep Research with Open-Domain Evaluation and Multi-Stage Guardrails for Safety
Wei-Chieh Huang
Henry Peng Zou
Y. Wu
Dongyuan Li
Yankai Chen
...
Liancheng Fang
Langzhou He
Renhe Jiang
Philip S. Yu
248
2
0
13 Oct 2025
MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training
Taicheng Guo
Hai Wang
Chaochun Liu
Mohsen Golalikhani
Xin Chen
Xiangliang Zhang
Chandan K. Reddy
LRM
150
2
0
12 Oct 2025
Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics
Lianhao Zhou
Hongyi Ling
Cong Fu
Yepeng Huang
Michael Sun
...
X. Qian
Heng Ji
Wei Wang
Marinka Zitnik
Shuiwang Ji
LLMAG
LM&Ro
AI4CE
231
6
0
10 Oct 2025
Dyna-Mind: Learning to Simulate from Experience for Better AI Agents
Xiao Yu
Baolin Peng
Michel Galley
Hao Cheng
Qianhui Wu
Janardhan Kulkarni
Suman Nath
Zhou Yu
Jianfeng Gao
LRM
AI4CE
148
2
0
10 Oct 2025
Agent Learning via Early Experience
Kai Zhang
Xiangchao Chen
Bo Liu
Tianci Xue
Zeyi Liao
...
J. Zhu
Huan Sun
Jason Weston
Eric Fosler-Lussier
Y. Wu
OffRL
243
32
0
09 Oct 2025
Customer-R1: Personalized Simulation of Human Behaviors via RL-based LLM Agent in Online Shopping
Ziyi Wang
Yuxuan Lu
Yimeng Zhang
Jing Huang
Dakuo Wang
247
7
0
08 Oct 2025
Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents
Yiding Wang
Zhepei Wei
Xinyu Zhu
Yu Meng
203
3
0
06 Oct 2025
JEF-Hinter: Leveraging Offline Knowledge for Improving Web Agents Adaptation
Hadi Nekoei
Aman Jaiswal
Patrice Béchard
Oleh Shliazhko
Orlando Marquez Ayala
Mathieu Reymond
Massimo Caccia
Alexandre Drouin
Sarath Chandar
Alexandre Lacoste
KELM
175
2
0
05 Oct 2025
Gradient Coupling: The Hidden Barrier to Generalization in Agentic Reinforcement Learning
Jingyu Liu
Xiaopeng Wu
Jingquan Peng
Kehan Chen
Chuan Yu
Lizhong Ding
Yong Liu
244
0
0
28 Sep 2025
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning
Zimu Lu
Houxing Ren
Yunqiao Yang
Ke Wang
Zhuofan Zong
Junting Pan
Mingjie Zhan
Jiaming Song
LLMAG
176
1
0
26 Sep 2025
Agentic Reinforcement Learning with Implicit Step Rewards
Xiaoqian Liu
Ke Wang
Yuchuan Wu
Fei Huang
Y. Li
Junge Zhang
Jianbin Jiao
OffRL
308
0
0
23 Sep 2025
ARE: Scaling Up Agent Environments and Evaluations
Pierre Andrews
Amine Benhalloum
Gerard Moreno-Torres Bertran
Matteo Bettini
Amar Budhiraja
...
Andrey Rusakov
Thomas Scialom
Vladislav Vorotilov
Mengjue Wang
Ian Yu
LLMAG
528
14
0
21 Sep 2025
TGPO: Tree-Guided Preference Optimization for Robust Web Agent Reinforcement Learning
Ziyuan Chen
Zhenghui Zhao
Zhangye Han
Miancan Liu
Xianhang Ye
Yiqing Li
Hongbo Min
Jinkui Ren
Xiantao Zhang
Guitao Cao
OffRL
246
0
0
17 Sep 2025
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents
Jiawei Wang
Jiacai Liu
Y. Fu
Y. Li
Xintao Wang
Yuan Lin
Yu Yue
L. Zhang
Y. X. R. Wang
Ke Wang
204
24
0
11 Sep 2025
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Xuan-Phi Nguyen
Shrey Pandit
R. Reddy
Austin Xu
Silvio Savarese
Caiming Xiong
Shafiq Joty
LLMAG
LRM
219
22
0
08 Sep 2025
Symbolic Graphics Programming with Large Language Models
Yamei Chen
H. Zhang
Yangyi Huang
Zeju Qiu
Kaipeng Zhang
Yandong Wen
Weiyang Liu
228
3
0
05 Sep 2025
EviNote-RAG: Enhancing RAG Models via Answer-Supportive Evidence Notes
Yuqin Dai
Guoqing Wang
Yuan Wang
Kairan Dou
Kaichen Zhou
...
Can Yi
Changhua Meng
Yuchen Zhou
Yongliang Shen
Shuai Lu
RALM
344
6
0
31 Aug 2025
UItron: Foundational GUI Agent with Advanced Perception and Planning
Zhixiong Zeng
Jing Huang
Liming Zheng
Wenkang Han
Yufeng Zhong
Lei Chen
Longrong Yang
Yingjie Chu
Yuzhi He
Lin Ma
LLMAG
242
14
0
29 Aug 2025
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Sikuan Yan
Xiufeng Yang
Zuchao Huang
Ercong Nie
Zifeng Ding
...
Volker Tresp
Yunpu Ma
LLMAG
KELM
296
79
0
27 Aug 2025
Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
Yong Deng
Guoqing Wang
ZhenZhe Ying
Xiaofeng Wu
Jinzhen Lin
...
Yang Qin
Yuan Wang
Quanxing Zha
Sunhao Dai
Changhua Meng
LRM
318
20
0
18 Aug 2025
Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning
Yuqin Dai
Shuo Yang
Guoqing Wang
Yong Deng
Zhanwei Zhang
...
Changhua Meng
Can Yi
Yuchen Zhou
Weiqiang Wang
Shuai Lu
RALM
KELM
219
4
0
11 Aug 2025
One Token to Fool LLM-as-a-Judge
Yulai Zhao
Haolin Liu
Dian Yu
Sunyuan Kung
Meijia Chen
Haitao Mi
Dong Yu
OffRL
LRM
316
42
0
11 Jul 2025
DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning
Wenxuan Shi
Haochen Tan
Chuqiao Kuang
Xiaoguang Li
Xiaozhe Ren
Chen Zhang
Hanting Chen
Yasheng Wang
Lifeng Shang
Fisher Yu
RALM
214
18
0
30 May 2025
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu
Baixuan Li
Runnan Fang
Wenbiao Yin
Liwen Zhang
...
Yong Jiang
Pengjun Xie
Fei Huang
Jingren Zhou
400
121
0
28 May 2025
WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback
Minda Hu
Tianqing Fang
Jianshu Zhang
Junyu Ma
Zhisong Zhang
Jingyan Zhou
Hongming Zhang
Haitao Mi
Dong Yu
Irwin King
LLMAG
LRM
494
9
0
26 May 2025
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
Zihan Wang
Kaidi Wang
Q. Wang
Pingyue Zhang
Linjie Li
...
Jiajun Wu
L. Fei-Fei
Lijuan Wang
Yejin Choi
Pengfei Yu
845
190
0
24 Apr 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
742
444
0
24 Mar 2025
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Bowen Jin
Hansi Zeng
Zhenrui Yue
Jinsung Yoon
Sercan O. Arik
Dong Wang
Hamed Zamani
Jiawei Han
OffRL
AI4TS
LRM
RALM
ReLM
KELM
996
867
0
12 Mar 2025
Language Models can Self-Improve at State-Value Estimation for Better Search
Ethan Mendes
Alan Ritter
LRM
558
4
0
04 Mar 2025
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Zehan Qi
Xiao-Chang Liu
Iat Long Iong
Hanyu Lai
Xingwu Sun
...
Shuntian Yao
Tianjie Zhang
Wei Xu
J. Tang
Yuxiao Dong
673
152
0
28 Jan 2025
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
International Conference on Learning Representations (ICLR), 2025
Hyungjoo Chae
Namyoung Kim
Kai Tzu-iunn Ong
Minju Gwak
Gwanwoo Song
Jihoon Kim
Seon Gyeom Kim
Dongha Lee
Jinyoung Yeo
LLMAG
513
82
0
17 Oct 2024
Large Language Models for Information Retrieval: A Survey
Yutao Zhu
Huaying Yuan
Shuting Wang
Jiongnan Liu
Wenhan Liu
Chenlong Deng
Haonan Chen
Zheng Liu
Zhicheng Dou
Ji-Rong Wen
KELM
781
525
0
14 Aug 2023