arXiv: 2505.16421

WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning

22 May 2025

Papers citing "WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning"

49 / 49 papers shown

GTM: Simulating the World of Tools for AI Agents

263

3

0

04 Dec 2025

Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation

Zhuosheng Zhang

120

0

0

27 Nov 2025

OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability

Arjun Subramonian

Nikolaos Tsilivis

Randall Balestriero

213

1

0

25 Nov 2025

Stabilizing Off-Policy Training for Long-Horizon LLM Agent via Turn-Level Importance Sampling and Clipping-Triggered Normalization

Alfredo García

Parminder Bhatia

Taha A. Kass-Hout

Mingyi Hong

268

1

0

25 Nov 2025

WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

789

5

0

17 Nov 2025

SynthAgent: Adapting Web Agents with Synthetic Supervision

...

Saravan Rajmohan

184

4

0

08 Nov 2025

Scaling Agent Learning via Experience Synthesis

...

568

10

0

05 Nov 2025

Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation

Ramraj Chandradevan

...

298

6

0

04 Nov 2025

Optimizing Retrieval for RAG via Reinforcement Learning

205

1

0

28 Oct 2025

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

343

4

0

23 Oct 2025

Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains

Xuan-Phi Nguyen

OffRL ALM LRM ELM

278

3

0

20 Oct 2025

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

Charu C. Aggarwal

636

9

0

19 Oct 2025

WEBSERV: A Browser-Server Environment for Efficient Training of Reinforcement Learning-based Web Agents at Scale

141

3

0

17 Oct 2025

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

...

168

2

0

16 Oct 2025

Towards Agentic Self-Learning LLMs in Search Environment

211

4

0

16 Oct 2025

DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping

175

1

0

14 Oct 2025

A Survey on Agentic Multimodal Large Language Models

...

LM&Ro AIFin AI4TS LRM AI4CE

306

12

0

13 Oct 2025

Deep Research with Open-Domain Evaluation and Multi-Stage Guardrails for Safety

Wei-Chieh Huang

...

Philip S. Yu

248

2

0

13 Oct 2025

MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training

Mohsen Golalikhani

Xiangliang Zhang

Chandan K. Reddy

150

2

0

12 Oct 2025

Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics

...

LLMAG LM&Ro AI4CE

231

6

0

10 Oct 2025

Dyna-Mind: Learning to Simulate from Experience for Better AI Agents

Janardhan Kulkarni

148

2

0

10 Oct 2025

Agent Learning via Early Experience

...

Eric Fosler-Lussier

243

32

0

09 Oct 2025

Customer-R1: Personalized Simulation of Human Behaviors via RL-based LLM Agent in Online Shopping

247

7

0

08 Oct 2025

Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents

203

3

0

06 Oct 2025

JEF-Hinter: Leveraging Offline Knowledge for Improving Web Agents Adaptation

Patrice Béchard

Orlando Marquez Ayala

Mathieu Reymond

Alexandre Drouin

Alexandre Lacoste

175

2

0

05 Oct 2025

Gradient Coupling: The Hidden Barrier to Generalization in Agentic Reinforcement Learning

244

0

0

28 Sep 2025

WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning

176

1

0

26 Sep 2025

Agentic Reinforcement Learning with Implicit Step Rewards

308

0

0

23 Sep 2025

ARE: Scaling Up Agent Environments and Evaluations

Amine Benhalloum

Gerard Moreno-Torres Bertran

...

Vladislav Vorotilov

528

14

0

21 Sep 2025

TGPO: Tree-Guided Preference Optimization for Robust Web Agent Reinforcement Learning

246

0

0

17 Sep 2025

Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

204

24

0

11 Sep 2025

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Xuan-Phi Nguyen

Silvio Savarese

219

22

0

08 Sep 2025

Symbolic Graphics Programming with Large Language Models

228

3

0

05 Sep 2025

EviNote-RAG: Enhancing RAG Models via Answer-Supportive Evidence Notes

...

344

6

0

31 Aug 2025

UItron: Foundational GUI Agent with Advanced Perception and Planning

242

14

0

29 Aug 2025

Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

...

Yunpu Ma

296

79

0

27 Aug 2025

Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward

...

318

20

0

18 Aug 2025

Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning

...

219

4

0

11 Aug 2025

One Token to Fool LLM-as-a-Judge

316

42

0

11 Jul 2025

DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning

214

18

0

30 May 2025

WebDancer: Towards Autonomous Information Seeking Agency

...

400

121

0

28 May 2025

WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback

494

9

0

26 May 2025

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

...

845

190

0

24 Apr 2025

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

742

444

0

24 Mar 2025

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

OffRL AI4TS LRM RALM ReLM KELM

996

867

0

12 Mar 2025

Language Models can Self-Improve at State-Value Estimation for Better Search

558

4

0

04 Mar 2025

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

...

673

152

0

28 Jan 2025

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

International Conference on Learning Representations (ICLR), 2024

Kai Tzu-iunn Ong

513

82

0

17 Oct 2024

Large Language Models for Information Retrieval: A Survey

781

525

0

14 Aug 2023
