2107.08285
Cited By
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
17 July 2021
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
Papers citing "Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences"
10 / 10 papers shown
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL
LRM
30
0
0
08 Apr 2025
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
OffRL
31
0
0
03 Jun 2024
Shared learning of powertrain control policies for vehicle fleets
Lindsey Kerbel
B. Ayalew
Andrej Ivanco
27
0
0
27 Apr 2024
HesScale: Scalable Computation of Hessian Diagonals
Mohamed Elsayed
A. R. Mahmood
14
7
0
20 Oct 2022
Critic Sequential Monte Carlo
Vasileios Lioutas
J. Lavington
Justice Sefas
Matthew Niedoba
Yunpeng Liu
Berend Zwartsenberg
Setareh Dabiri
Frank D. Wood
Adam Scibior
40
7
0
30 May 2022
The Geometry of Robust Value Functions
Kaixin Wang
Navdeep Kumar
Kuangqi Zhou
Bryan Hooi
Jiashi Feng
Shie Mannor
AAML
11
7
0
30 Jan 2022
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
87
136
0
30 Jan 2021
Ensuring Monotonic Policy Improvement in Entropy-regularized Value-based Reinforcement Learning
Lingwei Zhu
Takamitsu Matsubara
17
4
0
25 Aug 2020
CAQL: Continuous Action Q-Learning
Moonkyung Ryu
Yinlam Chow
Ross Anderson
Christian Tjandraatmadja
Craig Boutilier
197
42
0
26 Sep 2019
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
173
597
0
22 Sep 2016