Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents

16 October 2025

Papers citing "Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents"


Title: The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward
Authors: Long Li, Jiaran Hao, Jason Klein Liu, Zhijian Zhou, Yanting Miao, ..., Wei Chu, Zhe Wang, Shirui Pan, Chao Qu, Yuan Qi
Date: 09 Sep 2025