Learning Continuous Control Policies by Stochastic Value Gradients


30 October 2015
N. Heess
Greg Wayne
David Silver
Timothy Lillicrap
Yuval Tassa
Tom Erez

Papers citing "Learning Continuous Control Policies by Stochastic Value Gradients"

37 / 337 papers shown
Value Prediction Network
Junhyuk Oh
Satinder Singh
Honglak Lee
268
343
0
11 Jul 2017
Emergence of Locomotion Behaviours in Rich Environments
N. Heess
TB Dhruva
S. Sriram
Jay Lemmon
J. Merel
...
Tom Erez
Ziyun Wang
S. M. Ali Eslami
Martin Riedmiller
David Silver
483
979
0
07 Jul 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
380
59
0
15 Jun 2017
Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics
Tomoki Nishi
Prashant Doshi
Michael R. James
Danil Prokhorov
52
5
0
04 Jun 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2017
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
175
172
0
01 Jun 2017
Non-Markovian Control with Gated End-to-End Memory Policy Networks
J. Perez
T. Silander
OffRL
114
6
0
31 May 2017
Guide Actor-Critic for Continuous Control
Voot Tangkaratt
A. Abdolmaleki
Masashi Sugiyama
112
17
0
22 May 2017
Metacontrol for Adaptive Imagination-Based Optimization
Jessica B. Hamrick
A. J. Ballard
Razvan Pascanu
Oriol Vinyals
N. Heess
Peter W. Battaglia
152
69
0
07 May 2017
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
I. Popov
N. Heess
Timothy Lillicrap
Agrim Gupta
Gabriel Barth-Maron
Matej Vecerík
Thomas Lampe
Yuval Tassa
Tom Erez
Martin Riedmiller
OffRL
281
274
0
10 Apr 2017
Stochastic Neural Networks for Hierarchical Reinforcement Learning
Carlos Florensa
Yan Duan
Pieter Abbeel
BDL
273
372
0
10 Apr 2017
One-Shot Imitation Learning
Yan Duan
Marcin Andrychowicz
Bradly C. Stadie
Jonathan Ho
Jonas Schneider
Ilya Sutskever
Pieter Abbeel
Wojciech Zaremba
OffRL
316
721
0
21 Mar 2017
Sensor Fusion for Robot Control through Deep Reinforcement Learning
Steven Bohez
Tim Verbelen
E. D. Coninck
B. Vankeirsbilck
Pieter Simoens
Bart Dhoedt
SSL
113
29
0
13 Mar 2017
Prediction and Control with Temporal Segment Models
Nikhil Mishra
Pieter Abbeel
Igor Mordatch
BDL
124
66
0
12 Mar 2017
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Yevgen Chebotar
Karol Hausman
Marvin Zhang
Gaurav Sukhatme
S. Schaal
Sergey Levine
321
163
0
08 Mar 2017
Towards Generalization and Simplicity in Continuous Control
Aravind Rajeswaran
Kendall Lowrey
E. Todorov
Sham Kakade
OffRL
245
287
0
08 Mar 2017
Understanding Synthetic Gradients and Decoupled Neural Interfaces
Wojciech M. Czarnecki
G. Swirszcz
Max Jaderberg
Simon Osindero
Oriol Vinyals
Koray Kavukcuoglu
203
89
0
01 Mar 2017
Trainable Greedy Decoding for Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017
Jiatao Gu
Dong Wang
Victor O.K. Li
248
77
0
08 Feb 2017
Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning
Francois Belletti
Daniel Haziza
G. Gomes
Alexandre M. Bayen
178
144
0
30 Jan 2017
Model-based Adversarial Imitation Learning
Nir Baram
Oron Anschel
Shie Mannor
GAN
164
45
0
07 Dec 2016
RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning
Yan Duan
John Schulman
Xi Chen
Peter L. Bartlett
Ilya Sutskever
Pieter Abbeel
OffRL
303
1,098
0
09 Nov 2016
Reparameterization trick for discrete variables
Seiya Tokui
Issei Sato
99
11
0
04 Nov 2016
Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
390
796
0
03 Nov 2016
Towards Lifelong Self-Supervision: A Deep Learning Direction for Robotics
J. M. Wong
246
12
0
01 Nov 2016
Learning and Transfer of Modulated Locomotor Controllers
N. Heess
Greg Wayne
Yuval Tassa
Timothy Lillicrap
Martin Riedmiller
David Silver
178
218
0
17 Oct 2016
Sim-to-Real Robot Learning from Pixels with Progressive Nets
Conference on Robot Learning (CoRL), 2016
Andrei A. Rusu
Matej Vecerík
Thomas Rothörl
N. Heess
Razvan Pascanu
R. Hadsell
328
552
0
13 Oct 2016
Connecting Generative Adversarial Networks and Actor-Critic Methods
David Pfau
Oriol Vinyals
OffRL, AI4CE
303
189
0
06 Oct 2016
Playing FPS Games with Deep Reinforcement Learning
Guillaume Lample
Devendra Singh Chaplot
OffRL, EgoV
244
613
0
18 Sep 2016
Decoupled Neural Interfaces using Synthetic Gradients
International Conference on Machine Learning (ICML), 2016
Max Jaderberg
Wojciech M. Czarnecki
Simon Osindero
Oriol Vinyals
Alex Graves
David Silver
Koray Kavukcuoglu
289
385
0
18 Aug 2016
Actor-critic versus direct policy search: a comparison based on sample complexity
Arnaud de Froissard de Broissia
Olivier Sigaud
121
13
0
29 Jun 2016
Review of state-of-the-arts in artificial intelligence with application to AI safety problem
V. Shakirov
149
10
0
11 May 2016
Benchmarking Deep Reinforcement Learning for Continuous Control
Yan Duan
Xi Chen
Rein Houthooft
John Schulman
Pieter Abbeel
OffRL
493
1,768
0
22 Apr 2016
Continuous Deep Q-Learning with Model-based Acceleration
S. Gu
Timothy Lillicrap
Ilya Sutskever
Sergey Levine
250
1,047
0
02 Mar 2016
PLATO: Policy Learning using Adaptive Trajectory Optimization
G. Kahn
Tianhao Zhang
Sergey Levine
Pieter Abbeel
290
139
0
02 Mar 2016
Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms
Tom Zahavy
Bingyi Kang
Alex Sivak
Jiashi Feng
Huan Xu
Shie Mannor
OOD, AAML
289
12
0
07 Feb 2016
Memory-based control with recurrent neural networks
N. Heess
Jonathan J. Hunt
Timothy Lillicrap
David Silver
252
320
0
14 Dec 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
1.0K
14,720
0
09 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
International Conference on Learning Representations (ICLR), 2015
John Schulman
Philipp Moritz
Sergey Levine
Sai Li
Pieter Abbeel
OffRL
768
4,030
0
08 Jun 2015