1707.06347
Proximal Policy Optimization Algorithms
20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
Papers citing
"Proximal Policy Optimization Algorithms"
18 / 11,418 papers shown
Multi-task Learning with Gradient Guided Policy Specialization
Wenhao Yu
C. Karen Liu
Greg Turk
98
3
0
23 Sep 2017
Expanding Motor Skills through Relay Neural Networks
Visak C. V. Kumar
Sehoon Ha
C. Karen Liu
52
2
0
22 Sep 2017
Neural Optimizer Search with Reinforcement Learning
Irwan Bello
Barret Zoph
Vijay Vasudevan
Quoc V. Le
ODL
255
400
0
21 Sep 2017
Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning
Maximilian Hüttenrauch
Adrian Šošić
Gerhard Neumann
84
3
0
21 Sep 2017
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Peter Henderson
Wei-Di Chang
Pierre-Luc Bacon
David Meger
Joelle Pineau
Doina Precup
GAN
152
76
0
20 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
473
2,139
0
19 Sep 2017
Learning Sampling Distributions for Robot Motion Planning
Brian Ichter
James Harrison
Marco Pavone
273
388
0
16 Sep 2017
TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
Danijar Hafner
James Davidson
Vincent Vanhoucke
OffRL
185
52
0
08 Sep 2017
Mirror Descent Search and its Acceleration
Megumi Miyashita
S. Yano
T. Kondo
121
7
0
08 Sep 2017
Deep Learning for Video Game Playing
Niels Justesen
Philip Bontrager
Julian Togelius
S. Risi
VLM
259
227
0
25 Aug 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
389
2,830
0
19 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
349
659
0
17 Aug 2017
A Machine Learning Approach to Routing
Asaf Valadarsky
Michael Schapira
Dafna Shahaf
Aviv Tamar
153
39
0
10 Aug 2017
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
Felix Leibfried
Jordi Grau-Moya
Haitham Bou-Ammar
351
24
0
06 Aug 2017
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
738
6,001
0
21 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
208
113
0
06 Jul 2017
Teacher-Student Curriculum Learning
Tambet Matiisen
Avital Oliver
Taco S. Cohen
John Schulman
ODL
460
425
0
01 Jul 2017
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
954
7,476
0
19 Feb 2015