1611.02247
Cited By
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
7 November 2016
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL
BDL
Papers citing
"Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic"
46 / 196 papers shown
Title
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
Deirdre Quillen
Eric Jang
Ofir Nachum
Chelsea Finn
Julian Ibarz
Sergey Levine
OOD
OffRL
35
202
0
28 Feb 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
30
126
0
27 Feb 2018
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
Kai Zhang
Zhuoran Yang
Han Liu
Tong Zhang
Tamer Basar
43
581
0
23 Feb 2018
Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation
H. Maei
OffRL
10
32
0
21 Feb 2018
Clipped Action Policy Gradient
Yasuhiro Fujita
S. Maeda
OffRL
34
37
0
21 Feb 2018
Fourier Policy Gradients
M. Fellows
K. Ciosek
Shimon Whiteson
35
15
0
19 Feb 2018
Accelerated Primal-Dual Policy Optimization for Safe Reinforcement Learning
Qingkai Liang
Fanyu Que
E. Modiano
29
101
0
19 Feb 2018
Reinforcement Learning from Imperfect Demonstrations
Yang Gao
Huazhe Xu
Ji Lin
Feng Yu
Sergey Levine
Trevor Darrell
29
200
0
14 Feb 2018
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Cédric Colas
Olivier Sigaud
Pierre-Yves Oudeyer
29
157
0
14 Feb 2018
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search
Yelong Shen
Jianshu Chen
Po-Sen Huang
Yuqing Guo
Jianfeng Gao
29
127
0
12 Feb 2018
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Xiaoqin Zhang
Huimin Ma
OffRL
43
38
0
31 Jan 2018
Experience-driven Networking: A Deep Reinforcement Learning based Approach
Zhiyuan Xu
Jian Tang
Jingsong Meng
Weiyi Zhang
Yanzhi Wang
C. Liu
Dejun Yang
OffRL
35
359
0
17 Jan 2018
Expected Policy Gradients for Reinforcement Learning
K. Ciosek
Shimon Whiteson
50
51
0
10 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
52
8,170
0
04 Jan 2018
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang
Richard Liaw
Philipp Moritz
Robert Nishihara
Roy Fox
Ken Goldberg
Joseph E. Gonzalez
Michael I. Jordan
Ion Stoica
OffRL
AI4CE
31
173
0
26 Dec 2017
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Stephen Tu
Benjamin Recht
OffRL
37
130
0
22 Dec 2017
Action Branching Architectures for Deep Reinforcement Learning
Arash Tavakoli
Fabio Pardo
Petar Kormushev
22
260
0
24 Nov 2017
How Generative Adversarial Networks and Their Variants Work: An Overview
Yongjun Hong
Uiwon Hwang
Jaeyoon Yoo
Sungroh Yoon
GAN
41
153
0
16 Nov 2017
Costate-focused models for reinforcement learning
B. Behrouzi
Xuefei Liu
D. Tweed
OffRL
23
0
0
15 Nov 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
56
300
0
31 Oct 2017
Action-depedent Control Variates for Policy Optimization via Stein's Identity
Hao Liu
Yihao Feng
Yi Mao
Dengyong Zhou
Jian-wei Peng
Qiang Liu
35
4
0
30 Oct 2017
On- and Off-Policy Monotonic Policy Improvement
R. Iwaki
Minoru Asada
OffRL
22
0
0
10 Oct 2017
Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning
Maximilian Hüttenrauch
Adrian Šošić
Gerhard Neumann
11
3
0
21 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
74
1,932
0
19 Sep 2017
Mean Actor Critic
Cameron Allen
Kavosh Asadi
Melrose Roderick
Abdel-rahman Mohamed
George Konidaris
Michael Littman
36
44
0
01 Sep 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
65
2,780
0
19 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
22
622
0
17 Aug 2017
Benchmark Environments for Multitask Learning in Continuous Domains
Peter Henderson
Wei-Di Chang
Florian Shkurti
Johanna Hansen
David Meger
Gregory Dudek
14
40
0
14 Aug 2017
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control
Riashat Islam
Peter Henderson
Maziar Gomrokchi
Doina Precup
BDL
OffRL
19
251
0
10 Aug 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
46
965
0
08 Aug 2017
Deep Reinforcement Learning Attention Selection for Person Re-Identification
Xu Lan
Hanxiao Wang
S. Gong
Xiatian Zhu
26
6
0
10 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
19
106
0
06 Jul 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
33
57
0
15 Jun 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
35
164
0
01 Jun 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
61
1,302
0
30 May 2017
Learning End-to-end Multimodal Sensor Policies for Autonomous Navigation
Guan-Horng Liu
Avinash Siravuru
Sai P. Selvaraj
Manuela Veloso
George Kantor
16
69
0
30 May 2017
Enhanced Experience Replay Generation for Efficient Reinforcement Learning
Vincent Huang
Tobias Ley
Martha Vlachou-Konchylaki
Wenfeng Hu
OnRL
GAN
SyDa
24
9
0
23 May 2017
Stein Variational Policy Gradient
Yang Liu
Prajit Ramachandran
Qiang Liu
Jian-wei Peng
22
138
0
07 Apr 2017
Learning Combinatorial Optimization Algorithms over Graphs
H. Dai
Elias Boutros Khalil
Yuyu Zhang
B. Dilkina
Le Song
46
1,446
0
05 Apr 2017
Learning to Navigate Cloth using Haptics
Alexander Clegg
Wenhao Yu
Zackory M. Erickson
Jie Tan
Chenxi Liu
Greg Turk
21
23
0
20 Mar 2017
Towards Generalization and Simplicity in Continuous Control
Aravind Rajeswaran
Kendall Lowrey
E. Todorov
Sham Kakade
OffRL
55
276
0
08 Mar 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
35
466
0
28 Feb 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
26
1,316
0
27 Feb 2017
Preparing for the Unknown: Learning a Universal Policy with Online System Identification
Wenhao Yu
Jie Tan
Chenxi Liu
Greg Turk
OffRL
37
306
0
08 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
106
1,505
0
25 Jan 2017
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
187
603
0
22 Sep 2016