Optimization Issues in KL-Constrained Approximate Policy Iteration


11 February 2021
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári

Papers citing "Optimization Issues in KL-Constrained Approximate Policy Iteration"

7 / 7 papers shown
Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning
Haoxuan Pan
Deheng Ye
Xiaoming Duan
Qiang Fu
Wei Yang
Jianping He
Mingfei Sun
OffRL
23
2
0
20 Jan 2023
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks
Anton Dereventsov
Andrew Starnes
Clayton Webster
18
4
0
21 Nov 2022
Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets
Anton Dereventsov
A. Bibin
18
1
0
12 Oct 2022
Learning to Constrain Policy Optimization with Virtual Trust Region
Hung Le
Thommen Karimpanal George
Majid Abdolshah
D. Nguyen
Kien Do
Sunil R. Gupta
Svetha Venkatesh
16
3
0
20 Apr 2022
Understanding the Effect of Stochasticity in Policy Optimization
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
11
17
0
29 Oct 2021
A general class of surrogate functions for stable and efficient reinforcement learning
Sharan Vaswani
Olivier Bachem
Simone Totaro
Robert Mueller
Shivam Garg
M. Geist
Marlos C. Machado
P. S. Castro
Nicolas Le Roux
OffRL
24
15
0
12 Aug 2021
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
Samuel Neumann
Sungsu Lim
A. Joseph
Yangchen Pan
Adam White
Martha White
14
7
0
22 Oct 2018