ResearchTrend.AI
Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 7,402 papers shown
Title
Guiding Policies with Language via Meta-Learning
John D. Co-Reyes
Abhishek Gupta
Suvansh Sanjeev
Nick Altieri
Jacob Andreas
John DeNero
Pieter Abbeel
Sergey Levine
LM&Ro
26
63
0
19 Nov 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
54
402
0
19 Nov 2018
Policy Optimization with Model-based Explorations
Feiyang Pan
Qingpeng Cai
Anxiang Zeng
C. Pan
Qing Da
Hua-Lin He
Qing He
Pingzhong Tang
36
11
0
18 Nov 2018
Towards Governing Agent's Efficacy: Action-Conditional $β$-VAE for Deep Transparent Reinforcement Learning
John Yang
Gyujeong Lee
Minsung Hyun
Simyung Chang
Nojun Kwak
34
3
0
11 Nov 2018
Sample-Efficient Policy Learning based on Completely Behavior Cloning
Qiming Zou
Ling Wang
K. Lu
Yu Li
OffRL
27
0
0
09 Nov 2018
Meta-Learning for Multi-objective Reinforcement Learning
Xi Chen
Ali Ghadirzadeh
Mårten Björkman
Pablo G. Cámara
OffRL
23
54
0
08 Nov 2018
Correlation Filter Selection for Visual Tracking Using Reinforcement Learning
Yanchun Xie
Jimin Xiao
Hassan Jameel Asghar
Jeyarajan Thiyagalingam
Dali Kaafar
23
21
0
08 Nov 2018
RoboTurk: A Crowdsourcing Platform for Robotic Skill Learning through Imitation
Ajay Mandlekar
Yuke Zhu
Animesh Garg
Jonathan Booher
Max Spero
...
John Emmons
Anchit Gupta
Emre Orbay
Silvio Savarese
Li Fei-Fei
OffRL
48
288
0
07 Nov 2018
A Closer Look at Deep Policy Gradients
Andrew Ilyas
Logan Engstrom
Shibani Santurkar
Dimitris Tsipras
Firdaus Janoos
Larry Rudolph
Aleksander Madry
30
50
0
06 Nov 2018
Contingency-Aware Exploration in Reinforcement Learning
Jongwook Choi
Yijie Guo
Marcin Moczulski
Junhyuk Oh
Neal Wu
Mohammad Norouzi
Honglak Lee
32
73
0
05 Nov 2018
VIREL: A Variational Inference Framework for Reinforcement Learning
M. Fellows
Anuj Mahajan
Tim G. J. Rudner
Shimon Whiteson
DRL
38
54
0
03 Nov 2018
Temporal Regularization in Markov Decision Process
Pierre Thodoroff
A. Durand
Joelle Pineau
Doina Precup
30
15
0
01 Nov 2018
Exploration by Random Network Distillation
Yuri Burda
Harrison Edwards
Amos Storkey
Oleg Klimov
63
1,309
0
30 Oct 2018
Assessing Generalization in Deep Reinforcement Learning
Charles Packer
Katelyn Gao
Jernej Kos
Philipp Krahenbuhl
V. Koltun
D. Song
OffRL
34
235
0
29 Oct 2018
One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks
Tianhe Yu
Pieter Abbeel
Sergey Levine
Chelsea Finn
18
68
0
25 Oct 2018
Inverse reinforcement learning for video games
Aaron David Tucker
Adam Gleave
Stuart J. Russell
24
48
0
24 Oct 2018
RLgraph: Modular Computation Graphs for Deep Reinforcement Learning
Michael Schaarschmidt
Sven Mika
Kai Fricke
Eiko Yoneki
OffRL
23
5
0
21 Oct 2018
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
S. Srinivasan
Marc Lanctot
V. Zambaldi
Julien Perolat
K. Tuyls
Rémi Munos
Michael Bowling
18
148
0
21 Oct 2018
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Maxime Chevalier-Boisvert
Dzmitry Bahdanau
Salem Lahlou
Lucas Willems
Chitwan Saharia
Thien Huu Nguyen
Yoshua Bengio
ELM
47
234
0
18 Oct 2018
Policy Gradient in Partially Observable Environments: Approximation and Convergence
Kamyar Azizzadenesheli
Manish Kumar Bera
Anima Anandkumar
OffRL
46
8
0
18 Oct 2018
Learning Socially Appropriate Robot Approaching Behavior Toward Groups using Deep Reinforcement Learning
Yuan Gao
Fangkai Yang
Martin Frisk
Daniel Hernández
Christopher E. Peters
Ginevra Castellano
27
5
0
16 Oct 2018
ProMP: Proximal Meta-Policy Search
Jonas Rothfuss
Dennis Lee
I. Clavera
Tamim Asfour
Pieter Abbeel
35
209
0
16 Oct 2018
GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning
Jacky Liang
Viktor Makoviychuk
Ankur Handa
N. Chentanez
Miles Macklin
Dieter Fox
AI4CE
27
182
0
12 Oct 2018
Policy Transfer with Strategy Optimization
Wenhao Yu
Chenxi Liu
Greg Turk
43
80
0
12 Oct 2018
Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience
Yevgen Chebotar
Ankur Handa
Viktor Makoviychuk
Miles Macklin
J. Issac
Nathan D. Ratliff
Dieter Fox
40
500
0
12 Oct 2018
A Survey and Critique of Multiagent Deep Reinforcement Learning
Pablo Hernandez-Leal
Bilal Kartal
Matthew E. Taylor
OffRL
48
555
0
12 Oct 2018
Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space
Jiechao Xiong
Qing Wang
Zhuoran Yang
Peng Sun
Lei Han
Yang Zheng
Haobo Fu
Tong Zhang
Ji Liu
Han Liu
37
170
0
10 Oct 2018
Reinforcement Learning for Improving Agent Design
David R Ha
56
124
0
09 Oct 2018
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Shariq Iqbal
Fei Sha
29
741
0
05 Oct 2018
PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation
Perttu Hämäläinen
Amin Babadi
Xiaoxiao Ma
J. Lehtinen
37
62
0
05 Oct 2018
AutoLoss: Learning Discrete Schedules for Alternate Optimization
Haowen Xu
Huatian Zhang
Zhiting Hu
Xiaodan Liang
Ruslan Salakhutdinov
Eric Xing
32
30
0
04 Oct 2018
Episodic Curiosity through Reachability
Nikolay Savinov
Anton Raichuk
Raphaël Marinier
Damien Vincent
Marc Pollefeys
Timothy Lillicrap
Sylvain Gelly
17
267
0
04 Oct 2018
Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids
Yunzhu Li
Jiajun Wu
Russ Tedrake
J. Tenenbaum
Antonio Torralba
PINN
AI4CE
37
390
0
03 Oct 2018
CEM-RL: Combining evolutionary and gradient-based methods for policy search
Aloïs Pourchot
Olivier Sigaud
37
160
0
02 Oct 2018
The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Per-Arne Andersen
M. G. Olsen
Ole-Christoffer Granmo
DRL
22
17
0
02 Oct 2018
ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics
Yuanming Hu
Jiancheng Liu
Andrew Spielberg
J. Tenenbaum
William T. Freeman
Jiajun Wu
Daniela Rus
Wojciech Matusik
AI4CE
30
262
0
02 Oct 2018
Bayesian Policy Optimization for Model Uncertainty
Gilwoo Lee
Brian Hou
Aditya Mandalika
Jeongseok Lee
Sanjiban Choudhury
S. Srinivasa
40
41
0
01 Oct 2018
Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information
Arjun Sharma
Mohit Sharma
Nicholas Rhinehart
Kris Kitani
27
68
0
29 Sep 2018
Boosting Trust Region Policy Optimization by Normalizing Flows Policy
Yunhao Tang
Shipra Agrawal
TPM
39
29
0
27 Sep 2018
Scaling simulation-to-real transfer by learning composable robot skills
Ryan Julian
Eric Heiden
Zhanpeng He
Hejia Zhang
S. Schaal
Joseph J. Lim
Gaurav Sukhatme
Karol Hausman
25
15
0
26 Sep 2018
On Reinforcement Learning for Full-length Game of StarCraft
Zhen-Jia Pang
Ruo-Ze Liu
Zhou-Yu Meng
Yuanhang Zhang
Yang Yu
Tong Lu
OffRL
12
88
0
23 Sep 2018
A Learning Framework for High Precision Industrial Assembly
Yongxiang Fan
Jieliang Luo
Masayoshi Tomizuka
OffRL
11
48
0
23 Sep 2018
Fast Motion Planning for High-DOF Robot Systems Using Hierarchical System Identification
Biao Jia
Zherong Pan
Tianyi Zhou
29
5
0
21 Sep 2018
Adversarial Imitation via Variational Inverse Reinforcement Learning
A. H. Qureshi
Byron Boots
Michael C. Yip
22
61
0
17 Sep 2018
Policy Optimization via Importance Sampling
Alberto Maria Metelli
Matteo Papini
Francesco Faccio
Marcello Restelli
OffRL
53
89
0
17 Sep 2018
Model-Based Reinforcement Learning via Meta-Policy Optimization
I. Clavera
Jonas Rothfuss
John Schulman
Yasuhiro Fujita
Tamim Asfour
Pieter Abbeel
36
225
0
14 Sep 2018
Reinforcement Learning in Topology-based Representation for Human Body Movement with Whole Arm Manipulation
Weihao Yuan
Kaiyu Hang
Haoran Song
Danica Kragic
M. Y. Wang
J. A. Stork
22
26
0
12 Sep 2018
Safe Navigation with Human Instructions in Complex Scenes
Zhe Hu
Jia Pan
Tingxiang Fan
Ruigang Yang
Tianyi Zhou
32
28
0
12 Sep 2018
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Martin Schmid
Neil Burch
Marc Lanctot
Matej Moravcík
Rudolf Kadlec
Michael Bowling
54
64
0
09 Sep 2018
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
Ilya Kostrikov
Kumar Krishna Agrawal
Debidatta Dwibedi
Sergey Levine
Jonathan Tompson
49
258
0
09 Sep 2018