ResearchTrend.AI

From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning

6 November 2024
Zhirui Deng, Zhicheng Dou, Y. X. Zhu, Ji-Rong Wen, Ruibin Xiong, Mang Wang, Weipeng Chen
arXiv: 2411.03817

Papers citing "From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning"

2 / 2 papers shown
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Pengxiang Li, Zhi Gao, Bofei Zhang, Yapeng Mi, Xiaojian Ma, ..., Tao Yuan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
LLMAG | 70 | 0 | 0 | 30 Apr 2025
Exploring Expert Failures Improves LLM Agent Tuning
Li-Cheng Lan, Andrew Bai, Minhao Cheng, Ruochen Wang, Cho-Jui Hsieh
LRM | 110 | 0 | 0 | 17 Apr 2025