Leveraging the Variance of Return Sequences for Exploration Policy

17 November 2020

Papers citing "Leveraging the Variance of Return Sequences for Exploration Policy"

17 / 17 papers shown

Title
Information-Directed Exploration for Deep Reinforcement Learning Nikolay Nikolov Johannes Kirschner Felix Berkenkamp Andreas Krause 36 69 0 18 Dec 2018
Provably Efficient Maximum Entropy Exploration Elad Hazan Sham Kakade Karan Singh A. V. Soest 36 295 0 06 Dec 2018
Distributional Reinforcement Learning with Quantile Regression Will Dabney Mark Rowland Marc G. Bellemare Rémi Munos 61 753 0 27 Oct 2017
A Distributional Perspective on Reinforcement Learning Marc G. Bellemare Will Dabney Rémi Munos OffRL 57 1,497 0 21 Jul 2017
Noisy Networks for Exploration Meire Fortunato M. G. Azar Bilal Piot Jacob Menick Ian Osband ... Rémi Munos Demis Hassabis Olivier Pietquin Charles Blundell Shane Legg 45 890 0 30 Jun 2017
Parameter Space Noise for Exploration Matthias Plappert Rein Houthooft Prafulla Dhariwal Szymon Sidor Richard Y. Chen Xi Chen Tamim Asfour Pieter Abbeel Marcin Andrychowicz 38 594 0 06 Jun 2017
Curiosity-driven Exploration by Self-supervised Prediction Deepak Pathak Pulkit Agrawal Alexei A. Efros Trevor Darrell LRM SSL 87 2,416 0 15 May 2017
Count-Based Exploration with Neural Density Models Georg Ostrovski Marc G. Bellemare Aaron van den Oord Rémi Munos 59 616 0 03 Mar 2017
Reinforcement Learning with Deep Energy-Based Policies Tuomas Haarnoja Haoran Tang Pieter Abbeel Sergey Levine 35 1,329 0 27 Feb 2017
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning Haoran Tang Rein Houthooft Davis Foote Adam Stooke Xi Chen Yan Duan John Schulman F. Turck Pieter Abbeel OffRL 69 764 0 15 Nov 2016
Unifying Count-Based Exploration and Intrinsic Motivation Marc G. Bellemare S. Srinivasan Georg Ostrovski Tom Schaul D. Saxton Rémi Munos 136 1,465 0 06 Jun 2016
Deep Exploration via Bootstrapped DQN Ian Osband Charles Blundell Alexander Pritzel Benjamin Van Roy 38 1,302 0 15 Feb 2016
Dueling Network Architectures for Deep Reinforcement Learning Ziyun Wang Tom Schaul Matteo Hessel H. V. Hasselt Marc Lanctot Nando de Freitas OffRL 50 3,742 0 20 Nov 2015
Prioritized Experience Replay Tom Schaul John Quan Ioannis Antonoglou David Silver OffRL 164 3,777 0 18 Nov 2015
Deep Reinforcement Learning with Double Q-learning H. V. Hasselt A. Guez David Silver OffRL 73 7,590 0 22 Sep 2015
Adam: A Method for Stochastic Optimization Diederik P. Kingma Jimmy Ba ODL 104 149,474 0 22 Dec 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents Marc G. Bellemare Yavar Naddaf J. Veness Michael Bowling 38 2,989 0 19 Jul 2012