Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.06858
Cited By
Soft policy optimization using dual-track advantage estimator
15 September 2020
Yubo Huang
Xuechun Wang
Luobao Zou
Zhiwei Zhuang
Weidong Zhang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Soft policy optimization using dual-track advantage estimator"
5 / 5 papers shown
Title
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Ma-teusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
53
1,215
0
16 Oct 2019
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
134
8,805
0
04 Feb 2016
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
164
3,777
0
18 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
73
7,590
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
97
13,174
0
09 Sep 2015
1