Learning Continuous Control Policies by Stochastic Value Gradients

30 October 2015
N. Heess
Greg Wayne
David Silver
Timothy Lillicrap
Yuval Tassa
Tom Erez

Papers citing "Learning Continuous Control Policies by Stochastic Value Gradients"

50 / 337 papers shown
Reinforcement Learning in Robotic Motion Planning by Combined Experience-based Planning and Self-Imitation Learning
Sha Luo
Lambert Schomaker
233
15
0
11 Jun 2023
PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm
Neurocomputing, 2023
Wensong Bai
Chao Zhang
Yichao Fu
Lingwei Peng
Hui Qian
Bin Dai
190
1
0
11 Jun 2023
Self-Supervised Reinforcement Learning that Transfers using Random Features
Neural Information Processing Systems (NeurIPS), 2023
Boyuan Chen
Chuning Zhu
Pulkit Agrawal
Jianchao Tan
Abhishek Gupta
OffRL, SSL
247
12
0
26 May 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Neural Information Processing Systems (NeurIPS), 2023
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
322
5
0
24 May 2023
A Generalist Dynamics Model for Control
Ingmar Schubert
Jingwei Zhang
Jake Bruce
Sarah Bechtle
Emilio Parisotto
Martin Riedmiller
Jost Tobias Springenberg
Arunkumar Byravan
Leonard Hasenclever
N. Heess
AI4CE
205
35
0
18 May 2023
Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories and Averting Negative Side Effects
International Conference on Automated Planning and Scheduling (ICAPS), 2023
Siow Meng Low
Akshat Kumar
Scott Sanner
127
2
0
06 Apr 2023
Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Daniel Palenicek
M. Lutter
João Carvalho
Jan Peters
182
4
0
07 Mar 2023
Taylor TD-learning
Neural Information Processing Systems (NeurIPS), 2023
Michele Garibbo
Maxime Robeyns
Laurence Aitchison
OffRL
238
2
0
27 Feb 2023
Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains
Jingwei Zhang
Jost Tobias Springenberg
Arunkumar Byravan
Leonard Hasenclever
A. Abdolmaleki
Dushyant Rao
N. Heess
Martin Riedmiller
157
5
0
24 Feb 2023
Stochastic Generative Flow Networks
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
L. Pan
Dinghuai Zhang
Moksh Jain
Longbo Huang
Yoshua Bengio
BDL
254
38
0
19 Feb 2023
Predictable MDP Abstraction for Unsupervised Model-Based RL
International Conference on Machine Learning (ICML), 2023
Seohong Park
Sergey Levine
214
10
0
08 Feb 2023
DiSProD: Differentiable Symbolic Propagation of Distributions for Planning
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Palash Chatterjee
Ashutosh Chapagain
Weizhe (Wesley) Chen
Roni Khardon
271
2
0
03 Feb 2023
Extreme Q-Learning: MaxEnt RL without Entropy
International Conference on Learning Representations (ICLR), 2023
Divyansh Garg
Joey Hejna
Matthieu Geist
Stefano Ermon
OffRL
270
103
0
05 Jan 2023
Latent Variable Representation for Reinforcement Learning
International Conference on Learning Representations (ICLR), 2022
Zhaolin Ren
Chenjun Xiao
Tianjun Zhang
Na Li
Zhaoran Wang
Sujay Sanghavi
Dale Schuurmans
Bo Dai
OffRL
227
12
0
17 Dec 2022
Physics-Informed Model-Based Reinforcement Learning
Conference on Learning for Dynamics & Control (L4DC), 2022
Adithya Ramesh
Balaraman Ravindran
188
23
0
05 Dec 2022
Predictive Sampling: Real-time Behaviour Synthesis with MuJoCo
Taylor A. Howell
Nimrod Gileadi
S. Tunyasuvunakool
Kevin Zakka
Tom Erez
Yuval Tassa
289
119
0
01 Dec 2022
The Benefits of Model-Based Generalization in Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
K. Young
Aditya A. Ramesh
Louis Kirsch
Jürgen Schmidhuber
OffRL
347
15
0
04 Nov 2022
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
International Conference on Machine Learning (ICML), 2022
Siddharth Nayak
Kenneth M. F. Choi
Wenqi Ding
Sydney I. Dolan
Karthik Gopalakrishnan
H. Balakrishnan
212
61
0
03 Nov 2022
Integrated Decision and Control for High-Level Automated Vehicles by Mixed Policy Gradient and Its Experiment Verification
Yang Guan
Liye Tang
Chuanxiao Li
Shengbo Eben Li
Yangang Ren
Junqing Wei
Bo Zhang
Ke Li
120
1
0
19 Oct 2022
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
Neural Information Processing Systems (NeurIPS), 2022
Ashish Kumar Jayant
S. Bhatnagar
OffRL
165
61
0
14 Oct 2022
ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters
ACM Transactions on Graphics (TOG), 2022
Heyuan Yao
Zhenhua Song
Bin Chen
Libin Liu
DRL, VGen
169
57
0
12 Oct 2022
Training Efficient Controllers via Analytic Policy Gradient
IEEE International Conference on Robotics and Automation (ICRA), 2022
Nina Wiedemann
Valentin Wüest
Antonio Loquercio
M. Müller
Dario Floreano
Davide Scaramuzza
OffRL
280
25
0
26 Sep 2022
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
International Conference on Learning Representations (ICLR), 2022
Raj Ghugare
Homanga Bharadhwaj
Benjamin Eysenbach
Sergey Levine
Ruslan Salakhutdinov
OffRL
336
28
0
18 Sep 2022
Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Shen Zhang
161
6
0
16 Sep 2022
A model-based approach to meta-Reinforcement Learning: Transformers and tree search
The European Symposium on Artificial Neural Networks (ESANN), 2022
Brieuc Pinon
Jean-Charles Delvenne
Raphaël Jungers
OffRL
220
4
0
24 Aug 2022
Entropy Enhanced Multi-Agent Coordination Based on Hierarchical Graph Learning for Continuous Action Space
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2022
Yining Chen
Ke Wang
Guang-hua Song
Xiaohong Jiang
132
3
0
23 Aug 2022
Efficient Planning in a Compact Latent Action Space
International Conference on Learning Representations (ICLR), 2022
Zhengyao Jiang
Tianjun Zhang
Michael Janner
Yueying Li
Tim Rocktäschel
Edward Grefenstette
Yuandong Tian
OffRL
289
54
0
22 Aug 2022
MPC-based Imitation Learning for Safe and Human-like Autonomous Driving
F. S. Acerbo
Jan Swevers
Tinne Tuytelaars
Tong Duy Son
65
2
0
24 Jun 2022
Auto-Encoding Adversarial Imitation Learning
Kaifeng Zhang
Rui Zhao
Ziming Zhang
Yang Gao
223
1
0
22 Jun 2022
A Survey on Model-based Reinforcement Learning
Science China Information Sciences (Sci. China Inf. Sci.), 2022
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL, LRM
346
152
0
19 Jun 2022
Autonomous Platoon Control with Integrated Deep Reinforcement Learning and Dynamic Programming
IEEE Internet of Things Journal (IEEE IoT J.), 2022
Tong Liu
Lei Lei
Kan Zheng
Kuan Zhang
298
36
0
15 Jun 2022
Open-Ended Learning Strategies for Learning Complex Locomotion Skills
Fangqin Zhou
Joaquin Vanschoren
237
2
0
14 Jun 2022
Reinforcement Learning for Vision-based Object Manipulation with Non-parametric Policy and Action Primitives
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Dongwon Son
Myungsin Kim
Jaecheol Sim
Wonsik Shin
167
1
0
12 Jun 2022
A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud
Knowledge Discovery and Data Mining (KDD), 2022
Siqiao Xue
Chao Qu
Xiaoming Shi
Cong Liao
Shiyi Zhu
...
Yun Hu
Lei Lei
Yang Zheng
Jianguo Li
James Y. Zhang
224
55
0
31 May 2022
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Qingfeng Lan
Yangchen Pan
Jun Luo
A. R. Mahmood
OffRL
499
9
0
22 May 2022
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Zhiwei Xu
Dapeng Li
Bin Zhang
Yuan Zhan
Yunru Bai
Guoliang Fan
OffRL
274
11
0
20 Apr 2022
Accelerated Policy Learning with Parallel Differentiable Simulation
International Conference on Learning Representations (ICLR), 2022
Jie Xu
Viktor Makoviychuk
Yashraj S. Narang
Fabio Ramos
Wojciech Matusik
Animesh Garg
Lukasz Wawrzyniak
248
126
0
14 Apr 2022
Revisiting Model-based Value Expansion
Daniel Palenicek
M. Lutter
Jan Peters
191
2
0
28 Mar 2022
Investigating Compounding Prediction Errors in Learned Dynamics Models
Nathan Lambert
K. Pister
Roberto Calandra
AI4CE
222
39
0
17 Mar 2022
Strategic Maneuver and Disruption with Reinforcement Learning Approaches for Multi-Agent Coordination
The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology (JDMS), 2022
Derrik E. Asher
Anjon Basak
Rolando Fernandez
P. Sharma
Erin G. Zaroukian
...
Thomas Mahre
Gerardo Galindo
Luke Frerichs
J. Rogers
J. Fossaceca
AI4CE
165
5
0
17 Mar 2022
Retrieval-Augmented Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Anirudh Goyal
A. Friesen
Andrea Banino
T. Weber
Nan Rosemary Ke
...
Michal Valko
Simon Osindero
Timothy Lillicrap
N. Heess
Charles Blundell
OffRL
406
66
0
17 Feb 2022
GrASP: Gradient-Based Affordance Selection for Planning
Vivek Veeriah
Zeyu Zheng
Richard L. Lewis
Satinder Singh
178
4
0
08 Feb 2022
A Temporal-Difference Approach to Policy Gradient Estimation
International Conference on Machine Learning (ICML), 2022
Samuele Tosatto
Andrew Patterson
Martha White
A. R. Mahmood
OffRL
405
2
0
04 Feb 2022
Tutorial on amortized optimization
Brandon Amos
OffRL
812
77
0
01 Feb 2022
Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes
AAAI Conference on Artificial Intelligence (AAAI), 2022
Chao Qu
Jue Chen
Siqiao Xue
Xiaoming Shi
James Y. Zhang
Hongyuan Mei
OffRL
242
22
0
29 Jan 2022
Joint Differentiable Optimization and Verification for Certified Reinforcement Learning
International Conference on Cyber-Physical Systems (ICCPS), 2022
Yixuan Wang
S. Zhan
Zhilu Wang
Chao Huang
Zhaoran Wang
Zhuoran Yang
Qi Zhu
188
24
0
28 Jan 2022
Reinforcement Learning for Personalized Drug Discovery and Design for Complex Diseases: A Systems Pharmacology Perspective
Ryan K. Tan
Yang Liu
Lei Xie
293
2
0
21 Jan 2022
Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic
Zhihai Wang
Jie Wang
Qi Zhou
Bin Li
Houqiang Li
192
37
0
16 Dec 2021
Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation
Todor Davchev
Oleg O. Sushkov
Jean-Baptiste Regli
S. Schaal
Y. Aytar
Markus Wulfmeier
Jonathan Scholz
220
19
0
01 Dec 2021
Generalized Decision Transformer for Offline Hindsight Information Matching
International Conference on Learning Representations (ICLR), 2021
Hiroki Furuta
Y. Matsuo
S. Gu
OffRL
261
118
0
19 Nov 2021