Learning Continuous Control Policies by Stochastic Value Gradients

30 October 2015

David Silver

Papers citing "Learning Continuous Control Policies by Stochastic Value Gradients"

37 / 337 papers shown

Value Prediction Network

Junhyuk Oh

Satinder Singh

Honglak Lee

268

343

11 Jul 2017

Emergence of Locomotion Behaviours in Rich Environments

...

Martin Riedmiller

David Silver

483

979

07 Jul 2017

Expected Policy Gradients

K. Ciosek

Shimon Whiteson

380

15 Jun 2017

Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics

04 Jun 2017

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2017

175

172

01 Jun 2017

Non-Markovian Control with Gated End-to-End Memory Policy Networks

J. Perez

T. Silander

OffRL

114

31 May 2017

Guide Actor-Critic for Continuous Control

Voot Tangkaratt

A. Abdolmaleki

Masashi Sugiyama

112

22 May 2017

Metacontrol for Adaptive Imagination-Based Optimization

152

07 May 2017

Data-efficient Deep Reinforcement Learning for Dexterous Manipulation

Martin Riedmiller

281

274

10 Apr 2017

Stochastic Neural Networks for Hierarchical Reinforcement Learning

Carlos Florensa

Yan Duan

Pieter Abbeel

BDL

273

372

10 Apr 2017

One-Shot Imitation Learning

Pieter Abbeel

316

721

21 Mar 2017

Sensor Fusion for Robot Control through Deep Reinforcement Learning

Pieter Simoens

113

13 Mar 2017

Prediction and Control with Temporal Segment Models

Nikhil Mishra

Pieter Abbeel

Igor Mordatch

BDL

124

12 Mar 2017

Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning

321

163

08 Mar 2017

Towards Generalization and Simplicity in Continuous Control

Aravind Rajeswaran

245

287

08 Mar 2017

Understanding Synthetic Gradients and Decoupled Neural Interfaces

Wojciech M. Czarnecki

203

01 Mar 2017

Trainable Greedy Decoding for Neural Machine Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017

Jiatao Gu

Dong Wang

Victor O.K. Li

248

08 Feb 2017

Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning

178

144

30 Jan 2017

Model-based Adversarial Imitation Learning

164

07 Dec 2016

RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning

Pieter Abbeel

303

1,098

09 Nov 2016

Reparameterization trick for discrete variables

Seiya Tokui

Issei Sato

04 Nov 2016

Sample Efficient Actor-Critic with Experience Replay

390

796

03 Nov 2016

Towards Lifelong Self-Supervision: A Deep Learning Direction for Robotics

J. M. Wong

246

01 Nov 2016

Learning and Transfer of Modulated Locomotor Controllers

Martin Riedmiller

David Silver

178

218

17 Oct 2016

Sim-to-Real Robot Learning from Pixels with Progressive Nets

Conference on Robot Learning (CoRL), 2016

328

552

13 Oct 2016

Connecting Generative Adversarial Networks and Actor-Critic Methods

David Pfau

Oriol Vinyals

OffRL AI4CE

303

189

06 Oct 2016

Playing FPS Games with Deep Reinforcement Learning

Guillaume Lample

Devendra Singh Chaplot

OffRL EgoV

244

613

18 Sep 2016

Decoupled Neural Interfaces using Synthetic Gradients

International Conference on Machine Learning (ICML), 2016

Max Jaderberg

Wojciech M. Czarnecki

David Silver

289

385

18 Aug 2016

Actor-critic versus direct policy search: a comparison based on sample complexity

Arnaud de Froissard de Broissia

Olivier Sigaud

121

29 Jun 2016

Review of state-of-the-arts in artificial intelligence with application to AI safety problem

V. Shakirov

149

11 May 2016

Benchmarking Deep Reinforcement Learning for Continuous Control

Pieter Abbeel

493

1,768

22 Apr 2016

Continuous Deep Q-Learning with Model-based Acceleration

250

1,047

02 Mar 2016

PLATO: Policy Learning using Adaptive Trajectory Optimization

G. Kahn

Tianhao Zhang

Sergey Levine

Pieter Abbeel

290

139

02 Mar 2016

Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms

289

07 Feb 2016

Memory-based control with recurrent neural networks

N. Heess

Jonathan J. Hunt

Timothy Lillicrap

David Silver

252

320

14 Dec 2015

Continuous control with deep reinforcement learning

Alexander Pritzel

David Silver

1.0K

14,720

09 Sep 2015

High-Dimensional Continuous Control Using Generalized Advantage Estimation

International Conference on Learning Representations (ICLR), 2015

Pieter Abbeel

768

4,030

08 Jun 2015