Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 3,103 papers shown
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control
Riashat Islam
Peter Henderson
Maziar Gomrokchi
Doina Precup
BDL
OffRL
21
251
0
10 Aug 2017
A Machine Learning Approach to Routing
Asaf Valadarsky
Michael Schapira
Dafna Shahaf
Aviv Tamar
28
38
0
10 Aug 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
46
966
0
08 Aug 2017
GPLAC: Generalizing Vision-Based Robotic Skills using Weakly Labeled Images
Avi Singh
Larry Yang
Sergey Levine
22
23
0
07 Aug 2017
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
Felix Leibfried
Jordi Grau-Moya
Haitham Bou-Ammar
51
24
0
06 Aug 2017
CASSL: Curriculum Accelerated Self-Supervised Learning
Adithyavairavan Murali
Lerrel Pinto
Dhiraj Gandhi
Abhinav Gupta
SSL
27
35
0
04 Aug 2017
Meta-SGD: Learning to Learn Quickly for Few-Shot Learning
Zhenguo Li
Fengwei Zhou
Fei Chen
Hang Li
23
1,115
0
31 Jul 2017
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Matej Vecerík
Todd Hester
Jonathan Scholz
Fumin Wang
Olivier Pietquin
Bilal Piot
N. Heess
Thomas Rothörl
Thomas Lampe
Martin Riedmiller
OffRL
38
659
0
27 Jul 2017
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
I. Higgins
Arka Pal
Andrei A. Rusu
Loic Matthey
Christopher P. Burgess
Alexander Pritzel
M. Botvinick
Charles Blundell
Alexander Lerchner
DRL
74
412
0
26 Jul 2017
Mutual Alignment Transfer Learning
Markus Wulfmeier
Ingmar Posner
Pieter Abbeel
35
61
0
25 Jul 2017
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
110
5,563
0
21 Jul 2017
RAIL: Risk-Averse Imitation Learning
Anirban Santara
A. Naik
Balaraman Ravindran
Dipankar Das
Dheevatsa Mudigere
Sasikanth Avancha
Bharat Kaul
30
18
0
20 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
157
18,562
0
20 Jul 2017
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
54
551
0
19 Jul 2017
Reverse Curriculum Generation for Reinforcement Learning
Carlos Florensa
David Held
Markus Wulfmeier
Michael Zhang
Pieter Abbeel
36
438
0
17 Jul 2017
Control of a Quadrotor with Reinforcement Learning
Jemin Hwangbo
Inkyu Sa
Roland Siegwart
Marco Hutter
32
477
0
17 Jul 2017
Efficient Architecture Search by Network Transformation
Han Cai
Tianyao Chen
Weinan Zhang
Yong Yu
Jun Wang
OOD
3DV
34
67
0
16 Jul 2017
ADAPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems
James Harrison
Animesh Garg
Boris Ivanovic
Yuke Zhu
Silvio Savarese
Li Fei-Fei
Marco Pavone
32
25
0
15 Jul 2017
Distral: Robust Multitask Reinforcement Learning
Yee Whye Teh
V. Bapst
Wojciech M. Czarnecki
John Quan
J. Kirkpatrick
R. Hadsell
N. Heess
Razvan Pascanu
65
547
0
13 Jul 2017
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
YuXuan Liu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
61
376
0
11 Jul 2017
A Simple Neural Attentive Meta-Learner
Nikhil Mishra
Mostafa Rohaninejad
Xi Chen
Pieter Abbeel
OOD
32
199
0
11 Jul 2017
Learning Heuristic Search via Imitation
M. Bhardwaj
Sanjiban Choudhury
Sebastian Scherer
31
80
0
10 Jul 2017
Robust Imitation of Diverse Behaviors
Ziyun Wang
J. Merel
Scott E. Reed
Greg Wayne
Nando de Freitas
N. Heess
34
195
0
10 Jul 2017
Emergence of Locomotion Behaviours in Rich Environments
N. Heess
TB Dhruva
S. Sriram
Jay Lemmon
J. Merel
...
Tom Erez
Ziyun Wang
S. M. Ali Eslami
Martin Riedmiller
David Silver
158
928
0
07 Jul 2017
Learning human behaviors from motion capture by adversarial imitation
J. Merel
Yuval Tassa
TB Dhruva
S. Srinivasan
Jay Lemmon
Ziyun Wang
Greg Wayne
N. Heess
GAN
27
201
0
07 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
27
106
0
06 Jul 2017
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Yuandong Tian
Qucheng Gong
Wenling Shang
Yuxin Wu
C. L. Zitnick
OffRL
27
126
0
04 Jul 2017
Teacher-Student Curriculum Learning
Tambet Matiisen
Avital Oliver
Taco S. Cohen
John Schulman
ODL
38
376
0
01 Jul 2017
Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management
Pei-hao Su
Paweł Budzianowski
Stefan Ultes
Milica Gasic
S. Young
OffRL
49
129
0
01 Jul 2017
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
30
889
0
30 Jun 2017
Path Integral Networks: End-to-End Differentiable Optimal Control
Masashi Okada
Luca Rigazio
T. Aoshima
PINN
37
56
0
29 Jun 2017
Energy-Based Sequence GANs for Recommendation and Their Connection to Imitation Learning
Jaeyoon Yoo
Heonseok Ha
Jihun Yi
J. Jon Ryu
Chanju Kim
Jung-Woo Ha
Young-Han Kim
Sungroh Yoon
GAN
29
14
0
28 Jun 2017
Count-Based Exploration in Feature Space for Reinforcement Learning
Jarryd Martin
S. N. Sasikumar
Tom Everitt
Marcus Hutter
24
122
0
25 Jun 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
33
57
0
15 Jun 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
63
3,171
0
12 Jun 2017
Data-Efficient Policy Evaluation Through Behavior Policy Search
Josiah P. Hanna
Philip S. Thomas
Peter Stone
S. Niekum
OffRL
27
39
0
12 Jun 2017
Unlocking the Potential of Simulators: Design with RL in Mind
Rika Antonova
S. Cruciani
21
2
0
08 Jun 2017
Parameter Space Noise for Exploration
Matthias Plappert
Rein Houthooft
Prafulla Dhariwal
Szymon Sidor
Richard Y. Chen
Xi Chen
Tamim Asfour
Pieter Abbeel
Marcin Andrychowicz
31
593
0
06 Jun 2017
Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics
Tomoki Nishi
Prashant Doshi
Michael R. James
Danil Prokhorov
30
5
0
04 Jun 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
35
164
0
01 Jun 2017
The Atari Grand Challenge Dataset
Vitaly Kurin
Sebastian Nowozin
Katja Hofmann
Lucas Beyer
Bastian Leibe
OffRL
31
43
0
31 May 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
61
1,309
0
30 May 2017
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
Karol Hausman
Yevgen Chebotar
S. Schaal
Gaurav Sukhatme
Joseph J. Lim
GAN
30
147
0
30 May 2017
Fine-grained acceleration control for autonomous intersection management using deep reinforcement learning
H. Mirzaei
T. Givargis
27
8
0
30 May 2017
Learning End-to-end Multimodal Sensor Policies for Autonomous Navigation
Guan-Horng Liu
Avinash Siravuru
Sai P. Selvaraj
Manuela Veloso
George Kantor
25
69
0
30 May 2017
Role Playing Learning for Socially Concomitant Mobile Robot Navigation
Mingming Li
Rui Jiang
S. Ge
Tong-heng Lee
21
41
0
29 May 2017
Diagonal Rescaling For Neural Networks
Jean Lafond
Nicolas Vasilache
Léon Bottou
14
11
0
25 May 2017
Enhanced Experience Replay Generation for Efficient Reinforcement Learning
Vincent Huang
Tobias Ley
Martha Vlachou-Konchylaki
Wenfeng Hu
OnRL
GAN
SyDa
24
9
0
23 May 2017
A unified view of entropy-regularized Markov decision processes
Gergely Neu
Anders Jonsson
Vicenç Gómez
67
255
0
22 May 2017
Guide Actor-Critic for Continuous Control
Voot Tangkaratt
A. Abdolmaleki
Masashi Sugiyama
26
17
0
22 May 2017