ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1707.06347
  4. Cited By
Proximal Policy Optimization Algorithms

Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL
ArXivPDFHTML

Papers citing "Proximal Policy Optimization Algorithms"

50 / 7,062 papers shown
Title
Go-Explore: a New Approach for Hard-Exploration Problems
Go-Explore: a New Approach for Hard-Exploration Problems
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
AI4TS
24
363
0
30 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
26
118
0
29 Jan 2019
Lyapunov-based Safe Policy Optimization for Continuous Control
Lyapunov-based Safe Policy Optimization for Continuous Control
Yinlam Chow
Ofir Nachum
Aleksandra Faust
Edgar A. Duénez-Guzmán
Mohammad Ghavamzadeh
33
244
0
28 Jan 2019
Designing a Multi-Objective Reward Function for Creating Teams of
  Robotic Bodyguards Using Deep Reinforcement Learning
Designing a Multi-Objective Reward Function for Creating Teams of Robotic Bodyguards Using Deep Reinforcement Learning
Hassam Sheikh
Ladislau Bölöni
20
3
0
28 Jan 2019
The Assistive Multi-Armed Bandit
The Assistive Multi-Armed Bandit
Lawrence Chan
Dylan Hadfield-Menell
S. Srinivasa
Anca Dragan
14
36
0
24 Jan 2019
Ablation Studies in Artificial Neural Networks
Ablation Studies in Artificial Neural Networks
Richard Meyes
Melanie Lu
Constantin Waubert de Puiseau
Tobias Meisen
16
211
0
24 Jan 2019
Distillation Strategies for Proximal Policy Optimization
Distillation Strategies for Proximal Policy Optimization
Sam Green
C. Vineyard
Ç. Koç
27
8
0
23 Jan 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Zhijian Zhang
Haozheng Li
Lu Zhang
Tianyin Zheng
Ting Zhang
Xiong Hao
Xiaoxin Chen
Min Chen
Fangxu Xiao
Wei Zhou
17
15
0
23 Jan 2019
Trust Region Value Optimization using Kalman Filtering
Trust Region Value Optimization using Kalman Filtering
Shirli Di-Castro Shashua
Shie Mannor
19
7
0
23 Jan 2019
Neuroflight: Next Generation Flight Control Firmware
Neuroflight: Next Generation Flight Control Firmware
W. Koch
R. Mancuso
Azer Bestavros
41
29
0
19 Jan 2019
On-Policy Trust Region Policy Optimisation with Replay Buffers
On-Policy Trust Region Policy Optimisation with Replay Buffers
D. Kangin
N. Pugeault
OffRL
19
3
0
18 Jan 2019
Imitation-Regularized Offline Learning
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan
Balakrishnan Narayanaswamy
OffRL
27
22
0
15 Jan 2019
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep
  Reinforcement Learning
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning
Ameer Haj-Ali
Qijing Huang
William S. Moses
J. Xiang
Ion Stoica
Krste Asanović
J. Wawrzynek
29
36
0
15 Jan 2019
Multi-Objective Reinforced Evolution in Mobile Neural Architecture
  Search
Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Xiangxiang Chu
Bo Zhang
Ruijun Xu
Hailong Ma
31
98
0
04 Jan 2019
A Theoretical Analysis of Deep Q-Learning
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang
35
597
0
01 Jan 2019
Mid-Level Visual Representations Improve Generalization and Sample
  Efficiency for Learning Visuomotor Policies
Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies
Alexander Sax
Bradley Emi
Amir Zamir
Leonidas J. Guibas
Silvio Savarese
Jitendra Malik
SSL
41
16
0
31 Dec 2018
Learning to Walk via Deep Reinforcement Learning
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
54
433
0
26 Dec 2018
VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for
  Model-based Control
VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control
Xingxing Liang
Qi Wang
Yanghe Feng
Zhong Liu
Jincai Huang
29
5
0
24 Dec 2018
TD-Regularized Actor-Critic Methods
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
30
32
0
19 Dec 2018
Hierarchical Macro Strategy Model for MOBA Game AI
Hierarchical Macro Strategy Model for MOBA Game AI
Bin Wu
Qiang Fu
Jing Liang
Peng-fei Qu
Xiaoqian Li
Liang Wang
Wei Liu
Wei Yang
Yongsheng Liu
34
63
0
19 Dec 2018
Learning Montezuma's Revenge from a Single Demonstration
Learning Montezuma's Revenge from a Single Demonstration
Tim Salimans
Richard J. Chen
42
136
0
08 Dec 2018
Communication-Efficient Policy Gradient Methods for Distributed
  Reinforcement Learning
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Tianyi Chen
Kai Zhang
G. Giannakis
Tamer Basar
OffRL
29
41
0
07 Dec 2018
Zero-shot Deep Reinforcement Learning Driving Policy Transfer for
  Autonomous Vehicles based on Robust Control
Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control
Zhuo Xu
Chen Tang
Masayoshi Tomizuka
OffRL
27
35
0
07 Dec 2018
Quantifying Generalization in Reinforcement Learning
Quantifying Generalization in Reinforcement Learning
K. Cobbe
Oleg Klimov
Christopher Hesse
Taehoon Kim
John Schulman
OffRL
54
661
0
06 Dec 2018
Relative Entropy Regularized Policy Iteration
Relative Entropy Regularized Policy Iteration
A. Abdolmaleki
Jost Tobias Springenberg
Jonas Degrave
Steven Bohez
Yuval Tassa
Dan Belov
N. Heess
Martin Riedmiller
27
72
0
05 Dec 2018
Adapting Auxiliary Losses Using Gradient Similarity
Adapting Auxiliary Losses Using Gradient Similarity
Yunshu Du
Wojciech M. Czarnecki
Siddhant M. Jayakumar
Mehrdad Farajtabar
Razvan Pascanu
Balaji Lakshminarayanan
35
156
0
05 Dec 2018
Mitigating Planner Overfitting in Model-Based Reinforcement Learning
Mitigating Planner Overfitting in Model-Based Reinforcement Learning
Dilip Arumugam
David Abel
Kavosh Asadi
N. Gopalan
Christopher Grimm
Jun Ki Lee
Lucas Lehnert
Michael L. Littman
24
11
0
03 Dec 2018
Generative Adversarial Self-Imitation Learning
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
29
58
0
03 Dec 2018
Hierarchical Policy Design for Sample-Efficient Learning of Robot Table
  Tennis Through Self-Play
Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play
R. Mahjourian
Navdeep Jaitly
N. Lazić
Sergey Levine
Risto Miikkulainen
24
16
0
30 Nov 2018
Hardware Conditioned Policies for Multi-Robot Transfer Learning
Hardware Conditioned Policies for Multi-Robot Transfer Learning
Tao Chen
Adithyavairavan Murali
Abhinav Gupta
21
101
0
24 Nov 2018
Connecting the Dots Between MLE and RL for Sequence Prediction
Connecting the Dots Between MLE and RL for Sequence Prediction
Bowen Tan
Zhiting Hu
Zichao Yang
Ruslan Salakhutdinov
Eric Xing
28
24
0
24 Nov 2018
Hierarchical visuomotor control of humanoids
Hierarchical visuomotor control of humanoids
J. Merel
Arun Ahuja
Vu Pham
S. Tunyasuvunakool
Siqi Liu
Dhruva Tirumala
N. Heess
Greg Wayne
48
97
0
23 Nov 2018
Guiding Policies with Language via Meta-Learning
Guiding Policies with Language via Meta-Learning
John D. Co-Reyes
Abhishek Gupta
Suvansh Sanjeev
Nick Altieri
Jacob Andreas
John DeNero
Pieter Abbeel
Sergey Levine
LM&Ro
26
63
0
19 Nov 2018
Scalable agent alignment via reward modeling: a research direction
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
36
397
0
19 Nov 2018
Policy Optimization with Model-based Explorations
Policy Optimization with Model-based Explorations
Feiyang Pan
Qingpeng Cai
Anxiang Zeng
C. Pan
Qing Da
Hua-Lin He
Qing He
Pingzhong Tang
36
11
0
18 Nov 2018
Towards Governing Agent's Efficacy: Action-Conditional $β$-VAE for
  Deep Transparent Reinforcement Learning
Towards Governing Agent's Efficacy: Action-Conditional βββ-VAE for Deep Transparent Reinforcement Learning
John Yang
Gyujeong Lee
Minsung Hyun
Simyung Chang
Nojun Kwak
29
3
0
11 Nov 2018
Sample-Efficient Policy Learning based on Completely Behavior Cloning
Sample-Efficient Policy Learning based on Completely Behavior Cloning
Qiming Zou
Ling Wang
K. Lu
Yu Li
OffRL
25
0
0
09 Nov 2018
Meta-Learning for Multi-objective Reinforcement Learning
Meta-Learning for Multi-objective Reinforcement Learning
Xi Chen
Ali Ghadirzadeh
Mårten Björkman
Pablo G. Cámara
OffRL
23
54
0
08 Nov 2018
Correlation Filter Selection for Visual Tracking Using Reinforcement
  Learning
Correlation Filter Selection for Visual Tracking Using Reinforcement Learning
Yanchun Xie
Jimin Xiao
Hassan Jameel Asghar
Jeyarajan Thiyagalingam
Dali Kaafar
18
21
0
08 Nov 2018
RoboTurk: A Crowdsourcing Platform for Robotic Skill Learning through
  Imitation
RoboTurk: A Crowdsourcing Platform for Robotic Skill Learning through Imitation
Mehdi Letafati
Yuke Zhu
Animesh Garg
Jonathan Booher
Max Spero
...
John Emmons
Anchit Gupta
Emre Orbay
Silvio Savarese
Li Fei-Fei
OffRL
48
284
0
07 Nov 2018
A Closer Look at Deep Policy Gradients
A Closer Look at Deep Policy Gradients
Andrew Ilyas
Logan Engstrom
Shibani Santurkar
Dimitris Tsipras
Firdaus Janoos
Larry Rudolph
Aleksander Madry
30
50
0
06 Nov 2018
Contingency-Aware Exploration in Reinforcement Learning
Contingency-Aware Exploration in Reinforcement Learning
Jongwook Choi
Yijie Guo
Marcin Moczulski
Junhyuk Oh
Neal Wu
Mohammad Norouzi
Honglak Lee
27
73
0
05 Nov 2018
VIREL: A Variational Inference Framework for Reinforcement Learning
VIREL: A Variational Inference Framework for Reinforcement Learning
M. Fellows
Anuj Mahajan
Tim G. J. Rudner
Shimon Whiteson
DRL
38
54
0
03 Nov 2018
Temporal Regularization in Markov Decision Process
Temporal Regularization in Markov Decision Process
Pierre Thodoroff
A. Durand
Joelle Pineau
Doina Precup
30
15
0
01 Nov 2018
Assessing Generalization in Deep Reinforcement Learning
Assessing Generalization in Deep Reinforcement Learning
Charles Packer
Katelyn Gao
Jernej Kos
Philipp Krahenbuhl
V. Koltun
D. Song
OffRL
18
233
0
29 Oct 2018
One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks
One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks
Tianhe Yu
Pieter Abbeel
Sergey Levine
Chelsea Finn
18
68
0
25 Oct 2018
RLgraph: Modular Computation Graphs for Deep Reinforcement Learning
RLgraph: Modular Computation Graphs for Deep Reinforcement Learning
Michael Schaarschmidt
Sven Mika
Kai Fricke
Eiko Yoneki
OffRL
23
5
0
21 Oct 2018
Actor-Critic Policy Optimization in Partially Observable Multiagent
  Environments
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
S. Srinivasan
Marc Lanctot
V. Zambaldi
Julien Perolat
K. Tuyls
Rémi Munos
Michael Bowling
13
148
0
21 Oct 2018
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language
  Learning
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Maxime Chevalier-Boisvert
Dzmitry Bahdanau
Salem Lahlou
Lucas Willems
Chitwan Saharia
Thien Huu Nguyen
Yoshua Bengio
ELM
38
232
0
18 Oct 2018
Policy Gradient in Partially Observable Environments: Approximation and
  Convergence
Policy Gradient in Partially Observable Environments: Approximation and Convergence
Kamyar Azizzadenesheli
Manish Kumar Bera
Anima Anandkumar
OffRL
30
8
0
18 Oct 2018
Previous
123...138139140141142
Next