Optimization Issues in KL-Constrained Approximate Policy Iteration

11 February 2021

Papers citing "Optimization Issues in KL-Constrained Approximate Policy Iteration"


Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning | Haoxuan Pan, Deheng Ye, Xiaoming Duan, Qiang Fu, Wei Yang, Jianping He, Mingfei Sun | OffRL 23 2 0 | 20 Jan 2023
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks | Anton Dereventsov, Andrew Starnes, Clayton Webster | 18 4 0 | 21 Nov 2022
Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets | Anton Dereventsov, A. Bibin | 18 1 0 | 12 Oct 2022
Learning to Constrain Policy Optimization with Virtual Trust Region | Hung Le, Thommen Karimpanal George, Majid Abdolshah, D. Nguyen, Kien Do, Sunil R. Gupta, Svetha Venkatesh | 16 3 0 | 20 Apr 2022
Understanding the Effect of Stochasticity in Policy Optimization | Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans | 11 17 0 | 29 Oct 2021
A general class of surrogate functions for stable and efficient reinforcement learning | Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, M. Geist, Marlos C. Machado, P. S. Castro, Nicolas Le Roux | OffRL 24 15 0 | 12 Aug 2021
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Samuel Neumann, Sungsu Lim, A. Joseph, Yangchen Pan, Adam White, Martha White | 14 7 0 | 22 Oct 2018