ResearchTrend.AI
arXiv: 1707.06347
Proximal Policy Optimization Algorithms


20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 7,399 papers shown
Investigating Generalisation in Continuous Deep Reinforcement Learning
Chenyang Zhao
Olivier Sigaud
F. Stulp
Timothy M. Hospedales
OffRL
27
48
0
19 Feb 2019
Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning
Andrew Silva
Matthew C. Gombolay
OffRL
32
20
0
15 Feb 2019
Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations
Yuhui Wang
Hao He
Xiaoyang Tan
30
9
0
15 Feb 2019
Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
Deepak Pathak
Chris Xiaoxuan Lu
Trevor Darrell
Phillip Isola
Alexei A. Efros
17
131
0
14 Feb 2019
Learn a Prior for RHEA for Better Online Planning
Xinyao Tong
W. Liu
Bin Li
OffRL
67
0
0
14 Feb 2019
Non-Asymptotic Analysis of Monte Carlo Tree Search
Devavrat Shah
Qiaomin Xie
Zhi Xu
19
9
0
14 Feb 2019
Deep Reinforcement Learning from Policy-Dependent Human Feedback
Dilip Arumugam
Jun Ki Lee
S. Saskin
Michael L. Littman
31
94
0
12 Feb 2019
VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems
T. Dreossi
Daniel J. Fremont
Shromona Ghosh
Edward J. Kim
H. Ravanbakhsh
Marcell Vazquez-Chanlatte
Sanjit A. Seshia
18
29
0
12 Feb 2019
Artificial Intelligence for Prosthetics - challenge solutions
L. Kidzinski
Carmichael F. Ong
Sharada Mohanty
Jennifer Hicks
Sean F. Carroll
...
E. Tumer
J. Watson
M. Salathé
Sergey Levine
Scott L. Delp
20
40
0
07 Feb 2019
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
Francisco M. Garcia
Philip S. Thomas
42
38
0
03 Feb 2019
Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning
Kyungjae Lee
Sungyub Kim
Sungbin Lim
Sungjoon Choi
Songhwai Oh
32
28
0
31 Jan 2019
Improving Evolutionary Strategies with Generative Neural Networks
Louis Faury
Clément Calauzènes
Olivier Fercoq
Syrine Krichene
29
12
0
31 Jan 2019
Go-Explore: a New Approach for Hard-Exploration Problems
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
AI4TS
31
364
0
30 Jan 2019
InfoBot: Transfer and Exploration via the Information Bottleneck
Anirudh Goyal
Riashat Islam
Daniel Strouse
Zafarali Ahmed
M. Botvinick
Hugo Larochelle
Yoshua Bengio
Sergey Levine
OffRL
14
166
0
30 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
26
119
0
29 Jan 2019
Lyapunov-based Safe Policy Optimization for Continuous Control
Yinlam Chow
Ofir Nachum
Aleksandra Faust
Edgar A. Duénez-Guzmán
Mohammad Ghavamzadeh
33
245
0
28 Jan 2019
Designing a Multi-Objective Reward Function for Creating Teams of Robotic Bodyguards Using Deep Reinforcement Learning
Hassam Sheikh
Ladislau Bölöni
20
3
0
28 Jan 2019
The Assistive Multi-Armed Bandit
Lawrence Chan
Dylan Hadfield-Menell
S. Srinivasa
Anca Dragan
19
36
0
24 Jan 2019
Ablation Studies in Artificial Neural Networks
Richard Meyes
Melanie Lu
Constantin Waubert de Puiseau
Tobias Meisen
25
210
0
24 Jan 2019
Distillation Strategies for Proximal Policy Optimization
Sam Green
C. Vineyard
Ç. Koç
27
8
0
23 Jan 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Zhijian Zhang
Haozheng Li
Lu Zhang
Tianyin Zheng
Ting Zhang
Xiong Hao
Xiaoxin Chen
Min Chen
Fangxu Xiao
Wei Zhou
17
15
0
23 Jan 2019
Trust Region Value Optimization using Kalman Filtering
Shirli Di-Castro Shashua
Shie Mannor
24
7
0
23 Jan 2019
Neuroflight: Next Generation Flight Control Firmware
W. Koch
R. Mancuso
Azer Bestavros
44
30
0
19 Jan 2019
On-Policy Trust Region Policy Optimisation with Replay Buffers
D. Kangin
N. Pugeault
OffRL
19
3
0
18 Jan 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
27
22
0
15 Jan 2019
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning
Ameer Haj-Ali
Qijing Huang
William S. Moses
J. Xiang
Ion Stoica
Krste Asanović
J. Wawrzynek
29
36
0
15 Jan 2019
Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Xiangxiang Chu
Bo Zhang
Ruijun Xu
Hailong Ma
38
98
0
04 Jan 2019
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang
46
598
0
01 Jan 2019
Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies
Alexander Sax
Bradley Emi
Amir Zamir
Leonidas Guibas
Silvio Savarese
Jitendra Malik
SSL
49
16
0
31 Dec 2018
Learn to Interpret Atari Agents
Zhao Yang
S. Bai
Li Zhang
Philip Torr
22
28
0
29 Dec 2018
Learning to Walk via Deep Reinforcement Learning
Tuomas Haarnoja
Sehoon Ha
Aurick Zhou
Jie Tan
George Tucker
Sergey Levine
54
434
0
26 Dec 2018
VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control
Xingxing Liang
Qi Wang
Yanghe Feng
Zhong Liu
Jincai Huang
31
5
0
24 Dec 2018
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
32
31
0
19 Dec 2018
Hierarchical Macro Strategy Model for MOBA Game AI
Bin Wu
Qiang Fu
Jing Liang
Peng-fei Qu
Xiaoqian Li
Liang Wang
Wei Liu
Wei Yang
Yongsheng Liu
34
63
0
19 Dec 2018
Learning Montezuma's Revenge from a Single Demonstration
Tim Salimans
Richard J. Chen
49
136
0
08 Dec 2018
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Tianyi Chen
Kai Zhang
G. Giannakis
Tamer Basar
OffRL
34
41
0
07 Dec 2018
Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control
Zhuo Xu
Chen Tang
Masayoshi Tomizuka
OffRL
27
35
0
07 Dec 2018
Online Model Distillation for Efficient Video Inference
Ravi Teja Mullapudi
Steven Chen
Keyi Zhang
Deva Ramanan
Kayvon Fatahalian
VGen
26
114
0
06 Dec 2018
Quantifying Generalization in Reinforcement Learning
K. Cobbe
Oleg Klimov
Christopher Hesse
Taehoon Kim
John Schulman
OffRL
59
662
0
06 Dec 2018
Relative Entropy Regularized Policy Iteration
A. Abdolmaleki
Jost Tobias Springenberg
Jonas Degrave
Steven Bohez
Yuval Tassa
Dan Belov
N. Heess
Martin Riedmiller
27
72
0
05 Dec 2018
Adapting Auxiliary Losses Using Gradient Similarity
Yunshu Du
Wojciech M. Czarnecki
Siddhant M. Jayakumar
Mehrdad Farajtabar
Razvan Pascanu
Balaji Lakshminarayanan
35
156
0
05 Dec 2018
Mitigating Planner Overfitting in Model-Based Reinforcement Learning
Dilip Arumugam
David Abel
Kavosh Asadi
N. Gopalan
Christopher Grimm
Jun Ki Lee
Lucas Lehnert
Michael L. Littman
27
11
0
03 Dec 2018
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
31
58
0
03 Dec 2018
Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play
R. Mahjourian
Navdeep Jaitly
N. Lazić
Sergey Levine
Risto Miikkulainen
32
16
0
30 Nov 2018
Hardware Conditioned Policies for Multi-Robot Transfer Learning
Tao Chen
Adithyavairavan Murali
Abhinav Gupta
29
102
0
24 Nov 2018
Connecting the Dots Between MLE and RL for Sequence Prediction
Bowen Tan
Zhiting Hu
Zichao Yang
Ruslan Salakhutdinov
Eric Xing
42
24
0
24 Nov 2018
Hierarchical visuomotor control of humanoids
J. Merel
Arun Ahuja
Vu Pham
S. Tunyasuvunakool
Siqi Liu
Dhruva Tirumala
N. Heess
Greg Wayne
51
97
0
23 Nov 2018
Guiding Policies with Language via Meta-Learning
John D. Co-Reyes
Abhishek Gupta
Suvansh Sanjeev
Nick Altieri
Jacob Andreas
John DeNero
Pieter Abbeel
Sergey Levine
LM&Ro
26
63
0
19 Nov 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
54
402
0
19 Nov 2018
Policy Optimization with Model-based Explorations
Feiyang Pan
Qingpeng Cai
Anxiang Zeng
C. Pan
Qing Da
Hua-Lin He
Qing He
Pingzhong Tang
36
11
0
18 Nov 2018