arXiv:1812.03381
Learning Montezuma's Revenge from a Single Demonstration

8 December 2018
Tim Salimans
Richard J. Chen

Papers citing "Learning Montezuma's Revenge from a Single Demonstration"

42 / 92 papers shown
Title
TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL
Clément Romac
Rémy Portelas
Katja Hofmann
Pierre-Yves Oudeyer
27
21
0
17 Mar 2021
Provably Breaking the Quadratic Error Compounding Barrier in Imitation Learning, Optimally
Nived Rajaraman
Yanjun Han
Lin F. Yang
Kannan Ramchandran
Jiantao Jiao
19
14
0
25 Feb 2021
Asymmetric self-play for automatic goal discovery in robotic manipulation
OpenAI
Matthias Plappert
Raul Sampedro
Tao Xu
Ilge Akkaya
...
Hyeonwoo Noh
Lilian Weng
Qiming Yuan
Casey Chu
Wojciech Zaremba
SSL
82
76
0
13 Jan 2021
Augmenting Policy Learning with Routines Discovered from a Single Demonstration
Zelin Zhao
Chuang Gan
Jiajun Wu
Xiaoxiao Guo
J. Tenenbaum
OffRL
11
5
0
23 Dec 2020
Diluted Near-Optimal Expert Demonstrations for Guiding Dialogue Stochastic Policy Optimisation
Thibault Cordier
Tanguy Urvoy
L. Rojas-Barahona
F. Lefèvre
29
5
0
25 Nov 2020
Meta Automatic Curriculum Learning
Rémy Portelas
Clément Romac
Katja Hofmann
Pierre-Yves Oudeyer
35
8
0
16 Nov 2020
Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy
Yunshu Du
Garrett A. Warnell
A. Gebremedhin
Peter Stone
Matthew E. Taylor
19
10
0
29 Sep 2020
Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator
Roy Eliya
J. Herrmann
14
2
0
17 Sep 2020
Curriculum Learning with Hindsight Experience Replay for Sequential Object Manipulation Tasks
Binyamin Manela
Armin Biess
37
28
0
21 Aug 2020
Guided Exploration with Proximal Policy Optimization using a Single Demonstration
Gabriele Libardi
Gianni De Fabritiis
4
24
0
07 Jul 2020
Curriculum learning for multilevel budgeted combinatorial problems
Adel Nabli
Margarida Carvalho
AI4CE
11
4
0
07 Jul 2020
Adaptive Procedural Task Generation for Hard-Exploration Problems
Kuan Fang
Yuke Zhu
Silvio Savarese
Li Fei-Fei
19
24
0
01 Jul 2020
The NetHack Learning Environment
Heinrich Küttler
Nantas Nardelli
Alexander H. Miller
Roberta Raileanu
Marco Selvatici
Edward Grefenstette
Tim Rocktäschel
20
177
0
24 Jun 2020
Reparameterized Variational Divergence Minimization for Stable Imitation
Dilip Arumugam
Debadeepta Dey
Alekh Agarwal
Asli Celikyilmaz
E. Nouri
W. Dolan
30
3
0
18 Jun 2020
Rinascimento: using event-value functions for playing Splendor
Ivan Bravi
Simon Lucas
22
2
0
10 Jun 2020
A Survey of Algorithms for Black-Box Safety Validation of Cyber-Physical Systems
Anthony Corso
Robert J. Moss
Mark Koren
Ritchie Lee
Mykel J. Kochenderfer
19
169
0
06 May 2020
First return, then explore
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
47
350
0
27 Apr 2020
PBCS : Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning
Guillaume Matheron
Nicolas Perrin
Olivier Sigaud
7
18
0
24 Apr 2020
Show Us the Way: Learning to Manage Dialog from Demonstrations
Gabriel Gordon-Hall
P. Gorinski
Gerasimos Lampouras
Ignacio Iacobacci
OffRL
15
11
0
17 Apr 2020
Reinforcement Learning via Reasoning from Demonstration
Lisa A. Torrey
6
1
0
12 Apr 2020
Adaptive Stress Testing without Domain Heuristics using Go-Explore
Mark Koren
Mykel J. Kochenderfer
9
17
0
08 Apr 2020
Automatic Curriculum Learning For Deep RL: A Short Survey
Rémy Portelas
Cédric Colas
Lilian Weng
Katja Hofmann
Pierre-Yves Oudeyer
ODL
19
167
0
10 Mar 2020
Exploration Based Language Learning for Text-Based Games
Andrea Madotto
Mahdi Namazifar
Joost Huizinga
Piero Molino
Adrien Ecoffet
H. Zheng
Alexandros Papangelis
Dian Yu
Chandra Khatri
Gokhan Tur
LLMAG
12
31
0
24 Jan 2020
Deep Reinforcement Learning for Complex Manipulation Tasks with Sparse Feedback
Binyamin Manela
13
0
0
12 Jan 2020
Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards
Xingyu Lu
Stas Tiomkin
Pieter Abbeel
OffRL
31
4
0
21 Dec 2019
Adaptive Leader-Follower Formation Control and Obstacle Avoidance via Deep Reinforcement Learning
Yanlin Zhou
F. Lu
George Pu
Xiyao Ma
Runhan Sun
Hsi-Yuan Chen
Xiaolin Li
D. Wu
14
19
0
15 Nov 2019
ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations
Daniel Seita
David M. Chan
Roshan Rao
Chen Tang
Mandi Zhao
John F. Canny
17
12
0
26 Oct 2019
Model-based Reinforcement Learning for Predictions and Control for Limit Order Books
Haoran Wei
Yuanbo Wang
L. Mangu
Keith S. Decker
13
24
0
09 Oct 2019
Visual Tracking by means of Deep Reinforcement Learning and an Expert Demonstrator
Matteo Dunnhofer
N. Martinel
G. Foresti
C. Micheloni
OffRL
31
31
0
18 Sep 2019
Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
T. Paine
Çağlar Gülçehre
Bobak Shahriari
Misha Denil
Matt Hoffman
...
Duncan Williams
Gabriel Barth-Maron
Ziyun Wang
Nando de Freitas
Worlds Team
20
80
0
03 Sep 2019
Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards
Yijie Guo
Jongwook Choi
Marcin Moczulski
Shengyu Feng
Samy Bengio
Mohammad Norouzi
Honglak Lee
17
10
0
24 Jul 2019
Towards Finding Longer Proofs
Zsolt Zombori
Adrián Csiszárik
Henryk Michalewski
C. Kaliszyk
Josef Urban
OffRL
LRM
29
15
0
30 May 2019
AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence
Jeff Clune
17
115
0
27 May 2019
Toybox: A Suite of Environments for Experimental Evaluation of Deep Reinforcement Learning
Emma Tosch
Kaleigh Clary
John Foley
David D. Jensen
OffRL
17
9
0
07 May 2019
Competitive Experience Replay
Hao Liu
Alexander R. Trott
R. Socher
Caiming Xiong
OffRL
27
52
0
01 Feb 2019
Go-Explore: a New Approach for Hard-Exploration Problems
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
AI4TS
24
361
0
30 Jan 2019
Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL
Bilal Kartal
Pablo Hernandez-Leal
Matthew E. Taylor
OffRL
33
9
0
30 Nov 2018
Exploring Restart Distributions
Arash Tavakoli
Vitaly Levdik
Riashat Islam
Christopher M. Smith
Petar Kormushev
OffRL
6
5
0
27 Nov 2018
Exploration by Random Network Distillation
Yuri Burda
Harrison Edwards
Amos Storkey
Oleg Klimov
37
1,295
0
30 Oct 2018
Expert-augmented actor-critic for ViZDoom and Montezumas Revenge
Michal Garmulewicz
Henryk Michalewski
Piotr Milos
16
8
0
10 Sep 2018
Backplay: "Man muss immer umkehren"
Cinjon Resnick
R. Raileanu
Sanyam Kapoor
Alex Peysakhovich
Kyunghyun Cho
Joan Bruna
OffRL
28
45
0
18 Jul 2018
DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
Xue Bin Peng
Pieter Abbeel
Sergey Levine
M. van de Panne
AI4CE
175
494
0
08 Apr 2018