ResearchTrend.AI
Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 7,213 papers shown
AutoLoss: Learning Discrete Schedules for Alternate Optimization
Haowen Xu
Huatian Zhang
Zhiting Hu
Xiaodan Liang
Ruslan Salakhutdinov
Eric Xing
32
30
0
04 Oct 2018
Episodic Curiosity through Reachability
Nikolay Savinov
Anton Raichuk
Raphaël Marinier
Damien Vincent
Marc Pollefeys
Timothy Lillicrap
Sylvain Gelly
17
267
0
04 Oct 2018
Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids
Yunzhu Li
Jiajun Wu
Russ Tedrake
J. Tenenbaum
Antonio Torralba
PINN
AI4CE
37
390
0
03 Oct 2018
CEM-RL: Combining evolutionary and gradient-based methods for policy search
Aloïs Pourchot
Olivier Sigaud
37
160
0
02 Oct 2018
The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Per-Arne Andersen
M. G. Olsen
Ole-Christoffer Granmo
DRL
22
17
0
02 Oct 2018
ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics
Yuanming Hu
Jiancheng Liu
Andrew Spielberg
J. Tenenbaum
William T. Freeman
Jiajun Wu
Daniela Rus
Wojciech Matusik
AI4CE
30
262
0
02 Oct 2018
Bayesian Policy Optimization for Model Uncertainty
Gilwoo Lee
Brian Hou
Aditya Mandalika
Jeongseok Lee
Sanjiban Choudhury
S. Srinivasa
22
41
0
01 Oct 2018
Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information
Arjun Sharma
Mohit Sharma
Nicholas Rhinehart
Kris Kitani
27
68
0
29 Sep 2018
Boosting Trust Region Policy Optimization by Normalizing Flows Policy
Yunhao Tang
Shipra Agrawal
TPM
39
29
0
27 Sep 2018
Scaling simulation-to-real transfer by learning composable robot skills
Ryan Julian
Eric Heiden
Zhanpeng He
Hejia Zhang
S. Schaal
Joseph J. Lim
Gaurav Sukhatme
Karol Hausman
25
15
0
26 Sep 2018
On Reinforcement Learning for Full-length Game of StarCraft
Zhen-Jia Pang
Ruo-Ze Liu
Zhou-Yu Meng
Yuanhang Zhang
Yang Yu
Tong Lu
OffRL
10
88
0
23 Sep 2018
Fast Motion Planning for High-DOF Robot Systems Using Hierarchical System Identification
Biao Jia
Zherong Pan
Tianyi Zhou
29
5
0
21 Sep 2018
Adversarial Imitation via Variational Inverse Reinforcement Learning
A. H. Qureshi
Byron Boots
Michael C. Yip
22
61
0
17 Sep 2018
Policy Optimization via Importance Sampling
Alberto Maria Metelli
Matteo Papini
Francesco Faccio
Marcello Restelli
OffRL
26
89
0
17 Sep 2018
Model-Based Reinforcement Learning via Meta-Policy Optimization
I. Clavera
Jonas Rothfuss
John Schulman
Yasuhiro Fujita
Tamim Asfour
Pieter Abbeel
30
225
0
14 Sep 2018
Reinforcement Learning in Topology-based Representation for Human Body Movement with Whole Arm Manipulation
Weihao Yuan
Kaiyu Hang
Haoran Song
Danica Kragic
M. Y. Wang
J. A. Stork
22
26
0
12 Sep 2018
Safe Navigation with Human Instructions in Complex Scenes
Zhe Hu
Jia Pan
Tingxiang Fan
Ruigang Yang
Tianyi Zhou
32
28
0
12 Sep 2018
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Martin Schmid
Neil Burch
Marc Lanctot
Matej Moravcík
Rudolf Kadlec
Michael Bowling
34
64
0
09 Sep 2018
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
Ilya Kostrikov
Kumar Krishna Agrawal
Debidatta Dwibedi
Sergey Levine
Jonathan Tompson
43
257
0
09 Sep 2018
Unity: A General Platform for Intelligent Agents
Arthur Juliani
Vincent-Pierre Berges
Esh Vckay
Andrew Cohen
Jonathan Harper
...
Chris Goy
Yuan Gao
Hunter Henry
Marwan Mattar
Danny Lange
44
809
0
07 Sep 2018
ANS: Adaptive Network Scaling for Deep Rectifier Reinforcement Learning Models
Yueh-hua Wu
Fan-Yun Sun
Yen-Yu Chang
Shou-De Lin
17
5
0
06 Sep 2018
Gibson Env: Real-World Perception for Embodied Agents
F. Xia
Amir Zamir
Zhi-Yang He
Alexander Sax
Jitendra Malik
Silvio Savarese
AI4CE
LM&Ro
34
817
0
31 Aug 2018
Importance mixing: Improving sample reuse in evolutionary policy search methods
Aloïs Pourchot
Nicolas Perrin
Olivier Sigaud
30
14
0
17 Aug 2018
Policy Optimization as Wasserstein Gradient Flows
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Lawrence Carin
33
66
0
09 Aug 2018
Learning Actionable Representations from Visual Observations
Debidatta Dwibedi
Jonathan Tompson
Corey Lynch
P. Sermanet
SSL
22
80
0
02 Aug 2018
Learning Dexterous In-Hand Manipulation
OpenAI
Marcin Andrychowicz
Bowen Baker
Maciek Chociej
Rafal Jozefowicz
...
Szymon Sidor
Joshua Tobin
Peter Welinder
Lilian Weng
Wojciech Zaremba
52
1,859
0
01 Aug 2018
ToriLLE: Learning Environment for Hand-to-Hand Combat
Anssi Kanervisto
Ville Hautamaki
34
2
0
26 Jul 2018
Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches
Sanyam Kapoor
27
31
0
25 Jul 2018
Meta-Learning Priors for Efficient Online Bayesian Regression
James Harrison
Apoorva Sharma
Marco Pavone
BDL
37
100
0
24 Jul 2018
Online Robust Policy Learning in the Presence of Unknown Adversaries
Aaron J. Havens
Zhanhong Jiang
Soumik Sarkar
AAML
28
43
0
16 Jul 2018
Hierarchical Reinforcement Learning Framework towards Multi-agent Navigation
Wenhao Ding
Shuaijun Li
Huihuan Qian
28
32
0
14 Jul 2018
Deep Learning in the Wild
Thilo Stadelmann
Mohammadreza Amirian
Ismail Arabaci
M. Arnold
G. Duivesteijn
...
Melanie Geiger
Stefan Lörwald
B. Meier
Katharina Rombach
Lukas Tuggener
24
42
0
13 Jul 2018
Automatically Composing Representation Transformations as a Means for Generalization
Michael Chang
Abhishek Gupta
Sergey Levine
Thomas Griffiths
31
68
0
12 Jul 2018
Variance Reduction for Reinforcement Learning in Input-Driven Environments
Hongzi Mao
S. Venkatakrishnan
Malte Schwarzkopf
Mohammad Alizadeh
OffRL
41
95
0
06 Jul 2018
BOHB: Robust and Efficient Hyperparameter Optimization at Scale
Stefan Falkner
Aaron Klein
Frank Hutter
BDL
54
1,077
0
04 Jul 2018
Using Reinforcement Learning with Partial Vehicle Detection for Intelligent Traffic Signal Control
Rusheng Zhang
A. Ishikawa
Wenli Wang
Benjamin Striner
Ozan Tonguz
32
101
0
04 Jul 2018
Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
Max Jaderberg
Wojciech M. Czarnecki
Iain Dunning
Luke Marris
Guy Lever
...
Joel Z Leibo
David Silver
Demis Hassabis
Koray Kavukcuoglu
T. Graepel
OffRL
43
716
0
03 Jul 2018
Towards Mixed Optimization for Reinforcement Learning with Program Synthesis
Surya Bhupatiraju
Kumar Krishna Agrawal
Rishabh Singh
16
6
0
01 Jul 2018
A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning
Amy Zhang
Nicolas Ballas
Joelle Pineau
CLL
OffRL
33
177
0
20 Jun 2018
RUDDER: Return Decomposition for Delayed Rewards
Jose A. Arjona-Medina
Michael Gillhofer
Michael Widrich
Thomas Unterthiner
Johannes Brandstetter
Sepp Hochreiter
42
215
0
20 Jun 2018
Learning Policy Representations in Multiagent Systems
Aditya Grover
Maruan Al-Shedivat
Jayesh K. Gupta
Yuri Burda
Harrison Edwards
AI4CE
29
123
0
17 Jun 2018
BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning
Boris Ivanovic
James Harrison
Apoorva Sharma
Mo Chen
Marco Pavone
OffRL
37
57
0
16 Jun 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
48
471
0
14 Jun 2018
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
John D. Co-Reyes
YuXuan Liu
Abhishek Gupta
Benjamin Eysenbach
Pieter Abbeel
Sergey Levine
SSL
BDL
AIFin
37
142
0
07 Jun 2018
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
Jiaxuan You
Bowen Liu
Rex Ying
Vijay S. Pande
J. Leskovec
GNN
215
891
0
07 Jun 2018
Neural Control Variates for Variance Reduction
Ruosi Wan
Mingjun Zhong
Haoyi Xiong
Zhanxing Zhu
BDL
DRL
27
18
0
01 Jun 2018
Supervised Policy Update for Deep Reinforcement Learning
Q. Vuong
Yiming Zhang
Keith Ross
19
20
0
29 May 2018
Learning Self-Imitating Diverse Policies
Tanmay Gangwani
Qiang Liu
Jian Peng
29
65
0
25 May 2018
Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
Manoj Kumar
George E. Dahl
Vijay Vasudevan
Mohammad Norouzi
36
25
0
25 May 2018
Object-Oriented Dynamics Predictor
Guangxiang Zhu
Zhiao Huang
Chongjie Zhang
AI4CE
24
36
0
25 May 2018