Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1805.11706
Cited By
Supervised Policy Update for Deep Reinforcement Learning
29 May 2018
Q. Vuong
Yiming Zhang
Keith Ross
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Supervised Policy Update for Deep Reinforcement Learning"
8 / 8 papers shown
Title
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
21
2
0
02 Feb 2023
Constrained Update Projection Approach to Safe Policy Optimization
Long Yang
Jiaming Ji
Juntao Dai
Linrui Zhang
Binbin Zhou
Pengfei Li
Yaodong Yang
Gang Pan
38
43
0
15 Sep 2022
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang
Keith Ross
OffRL
33
40
0
14 Jun 2021
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen
Zijian Zhou
ziqi wang
Che Wang
Yanqiu Wu
Keith Ross
OffRL
19
120
0
27 Oct 2019
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
H. F. Song
A. Abdolmaleki
Jost Tobias Springenberg
Aidan Clark
Hubert Soyer
...
Dhruva Tirumala
N. Heess
Dan Belov
Martin Riedmiller
M. Botvinick
29
121
0
26 Sep 2019
Constrained Policy Improvement for Safe and Efficient Reinforcement Learning
Elad Sarafian
Aviv Tamar
Sarit Kraus
OffRL
29
11
0
20 May 2018
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
271
5,329
0
05 Nov 2016
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
163
220
0
22 May 2012
1