On the role of planning in model-based deep reinforcement learning

8 November 2020
Jessica B. Hamrick
A. Friesen
Feryal M. P. Behbahani
A. Guez
Fabio Viola
Sims Witherspoon
Thomas W. Anthony
Lars Buesing
Petar Velickovic
T. Weber
    OffRL

Papers citing "On the role of planning in model-based deep reinforcement learning"

50 / 50 papers shown
Bootstrap Off-policy with World Model
Guojian Zhan
Likun Wang
Xiangteng Zhang
Jiaxin Gao
Masayoshi Tomizuka
Shengbo Eben Li
OffRL, OnRL
482
2
0
01 Nov 2025
Path Channels and Plan Extension Kernels: a Mechanistic Description of Planning in a Sokoban RNN
Mohammad Taufeeque
Aaron David Tucker
Adam Gleave
Adrià Garriga-Alonso
313
0
0
11 Jun 2025
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL, LRM
555
2
0
08 Apr 2025
Extendable Planning via Multiscale Diffusion
Chang Chen
Hany Hamed
Doojin Baek
Taegu Kang
Samyeul Noh
Yoshua Bengio
Sungjin Ahn
490
4
0
25 Mar 2025
On-line Policy Improvement using Monte-Carlo Search
Neural Information Processing Systems (NeurIPS), 1996
Gerald Tesauro
Gregory R. Galperin
460
276
0
09 Jan 2025
Demystifying MuZero Planning: Interpreting the Learned Model
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2024
Hung Guei
Yan-Ru Ju
Wei-Yu Chen
Tai-Lin Wu
327
2
0
07 Nov 2024
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Jiayu Chen
Wentse Chen
Shiyu Huang
Jeff Schneider
OffRL
469
8
0
15 Oct 2024
How to Choose a Reinforcement-Learning Algorithm
Fabian Bongratz
Vladimir Golkov
Lukas Mautner
Luca Della Libera
Frederik Heetmeyer
Felix Czaja
Julian Rodemann
Daniel Cremers
228
2
0
30 Jul 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
Xuan Zhang
Chao Du
Tianyu Pang
Qian Liu
Wei Gao
Min Lin
LRM, AI4CE
327
132
0
13 Jun 2024
Learning to Play Atari in a World of Tokens
Pranav Agarwal
Sheldon Andrews
Samira Ebrahimi Kahou
OffRL
262
6
0
03 Jun 2024
Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning
Arko Banerjee
Kia Rahmani
Joydeep Biswas
Işıl Dillig
237
9
0
22 May 2024
How does the primate brain combine generative and discriminative computations in vision?
Benjamin Peters
J. DiCarlo
Todd Gureckis
Ralf Haefner
Leyla Isik
...
Kimberly Stachenfeld
Zenna Tavares
Doris Y. Tsao
Ilker Yildirim
N. Kriegeskorte
262
8
0
11 Jan 2024
Simple Hierarchical Planning with Diffusion
Chang Chen
Fei Deng
Kenji Kawaguchi
Çağlar Gülçehre
Sungjin Ahn
OffRL, DiffM
287
73
0
05 Jan 2024
Predictive auxiliary objectives in deep RL mimic learning in the brain
International Conference on Learning Representations (ICLR), 2023
Ching Fang
Kimberly L. Stachenfeld
310
16
0
09 Oct 2023
Efficient Planning with Latent Diffusion
International Conference on Learning Representations (ICLR), 2023
Wenhao Li
DiffM
435
10
0
30 Sep 2023
Thinker: Learning to Plan and Act
Neural Information Processing Systems (NeurIPS), 2023
Stephen Chung
Ivan Anokhin
David M. Krueger
LLMAG, OffRL, LRM
317
12
0
27 Jul 2023
What model does MuZero learn?
European Conference on Artificial Intelligence (ECAI), 2023
Jinke He
Thomas M. Moerland
F. Oliehoek
357
5
0
01 Jun 2023
Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse
Information Sciences (Inf. Sci.), 2023
Jiafei Lyu
Le Wan
Zongqing Lu
Xiu Li
OffRL
216
17
0
29 May 2023
The Update-Equivalence Framework for Decision-Time Planning
International Conference on Learning Representations (ICLR), 2023
Samuel Sokota
Gabriele Farina
David J. Wu
Hengyuan Hu
Kevin A. Wang
J. Zico Kolter
Noam Brown
345
5
0
25 Apr 2023
Equivariant MuZero
Andreea Deac
T. Weber
George Papamakarios
235
4
0
09 Feb 2023
Learning Interaction-aware Motion Prediction Model for Decision-making in Autonomous Driving
Zhiyu Huang
Haochen Liu
Jingda Wu
Wenhui Huang
Chen Lv
244
23
0
08 Feb 2023
PushWorld: A benchmark for manipulation planning with tools and movable obstacles
Ken Kansky
Skanda Vaidyanath
Scott Swingle
Xinghua Lou
Miguel Lazaro-Gredilla
Dileep George
357
4
0
24 Jan 2023
Safe Reinforcement Learning using Data-Driven Predictive Control
International Conference on Communications, Signal Processing, and their Applications (ICCSPA), 2022
Mahmoud Selim
Amr Alanwar
M. El-Kharashi
Hazem Abbas
Karl H. Johansson
OffRL
248
7
0
20 Nov 2022
Continuous Monte Carlo Graph Search
Adaptive Agents and Multi-Agent Systems (AAMAS), 2022
Kalle Kujanpää
Amin Babadi
Yi Zhao
Arno Solin
Alexander Ilin
Joni Pajarinen
LRM
973
3
0
04 Oct 2022
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
International Conference on Learning Representations (ICLR), 2022
Raj Ghugare
Homanga Bharadhwaj
Benjamin Eysenbach
Sergey Levine
Ruslan Salakhutdinov
OffRL
387
31
0
18 Sep 2022
A model-based approach to meta-Reinforcement Learning: Transformers and tree search
The European Symposium on Artificial Neural Networks (ESANN), 2022
Brieuc Pinon
Jean-Charles Delvenne
Raphaël Jungers
OffRL
230
4
0
24 Aug 2022
Efficient Planning in a Compact Latent Action Space
International Conference on Learning Representations (ICLR), 2022
Zhengyao Jiang
Tianjun Zhang
Michael Janner
Yueying Li
Tim Rocktaschel
Edward Grefenstette
Yuandong Tian
OffRL
320
57
0
22 Aug 2022
Intelligent problem-solving as integrated hierarchical reinforcement learning
Nature Machine Intelligence (Nat. Mach. Intell.), 2022
Manfred Eppe
Christian Gumbsch
Matthias Kerzel
Phuong D. H. Nguyen
Martin Volker Butz
S. Wermter
293
91
0
18 Aug 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2022
Maximilian Igl
Daewoo Kim
Alex Kuefler
Paul Mougin
Punit Shah
K. Shiarlis
Drago Anguelov
Mark Palatucci
Brandyn White
Shimon Whiteson
259
81
0
06 May 2022
Physical Design using Differentiable Learned Simulators
Kelsey R. Allen
Tatiana López-Guevara
Kimberly L. Stachenfeld
Alvaro Sanchez-Gonzalez
Peter W. Battaglia
Jessica B. Hamrick
Tobias Pfaff
AI4CE
277
51
0
01 Feb 2022
Inferring perceptual decision making parameters from behavior in production and reproduction tasks
Nils Neupärtl
Constantin Rothkopf
191
1
0
31 Dec 2021
Learning Generalizable Behavior via Visual Rewrite Rules
Yiheng Xie
Mingxuan Li
Shangqun Yu
Michael Littman
DRL
231
1
0
09 Dec 2021
Procedural Generalization by Planning with Self-Supervised World Models
International Conference on Learning Representations (ICLR), 2021
Ankesh Anand
Jacob Walker
Yazhe Li
Eszter Vértes
Julian Schrittwieser
Sherjil Ozair
T. Weber
Jessica B. Hamrick
191
34
0
02 Nov 2021
Self-Consistent Models and Values
Neural Information Processing Systems (NeurIPS), 2021
Gregory Farquhar
Kate Baumli
Zita Marinho
Angelos Filos
Matteo Hessel
Hado van Hasselt
David Silver
257
8
0
25 Oct 2021
Model-based Reinforcement Learning for Service Mesh Fault Resiliency in a Web Application-level
Applied and Computational Engineering (ACE), 2021
Fanfei Meng
L. Jagadeesan
M. Thottan
AI4CE
122
14
0
21 Oct 2021
Neural Algorithmic Reasoners are Implicit Planners
Neural Information Processing Systems (NeurIPS), 2021
Andreea Deac
Petar Veličković
Ognjen Milinković
Pierre-Luc Bacon
Jian Tang
Mladen Nikolic
OffRL
174
26
0
11 Oct 2021
Evaluating model-based planning and planner amortization for continuous control
Arunkumar Byravan
Leonard Hasenclever
Piotr Trochim
M. Berk Mirza
Alessandro Davide Ialongo
...
Jost Tobias Springenberg
A. Abdolmaleki
N. Heess
J. Merel
Martin Riedmiller
192
18
0
07 Oct 2021
Potential-based Reward Shaping in Sokoban
Zhao Yang
Mike Preuss
Aske Plaat
OffRL
176
3
0
10 Sep 2021
Subgoal Search For Complex Reasoning Tasks
Neural Information Processing Systems (NeurIPS), 2021
K. Czechowski
Tomasz Odrzygóźdź
Marek Zbysiński
Michał Zawalski
Krzysztof Olejnik
Yuhuai Wu
Łukasz Kuciński
Piotr Miłoś
ReLM, LRM
267
40
0
25 Aug 2021
Deep Multiagent Reinforcement Learning: Challenges and Directions
Artificial Intelligence Review (AIR), 2021
Annie Wong
Thomas Bäck
Anna V. Kononova
Aske Plaat
AI4CE
304
159
0
29 Jun 2021
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Mingde Zhao
Zhen Liu
Sitao Luan
Shuyuan Zhang
Doina Precup
Yoshua Bengio
466
40
0
03 Jun 2021
Towards Deeper Deep Reinforcement Learning with Spectral Normalization
Neural Information Processing Systems (NeurIPS), 2021
Johan Bjorck
Daniel Schwalbe-Koda
Kilian Q. Weinberger
349
26
0
02 Jun 2021
Learning Neuro-Symbolic Relational Transition Models for Bilevel Planning
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Rohan Chitnis
Tom Silver
J. Tenenbaum
Tomas Lozano-Perez
L. Kaelbling
371
68
0
28 May 2021
Transfer Learning and Curriculum Learning in Sokoban
Zhao Yang
Mike Preuss
Aske Plaat
OffRL
290
3
0
25 May 2021
MBRL-Lib: A Modular Library for Model-based Reinforcement Learning
Luis Pineda
Brandon Amos
Amy Zhang
Nathan Lambert
Roberto Calandra
OffRL
370
53
0
20 Apr 2021
Muesli: Combining Improvements in Policy Optimization
International Conference on Machine Learning (ICML), 2021
Matteo Hessel
Ivo Danihelka
Fabio Viola
A. Guez
Simon Schmitt
Laurent Sifre
T. Weber
David Silver
H. V. Hasselt
274
70
0
13 Apr 2021
Planning and Learning Using Adaptive Entropy Tree Search
IEEE International Joint Conference on Neural Networks (IJCNN), 2021
Piotr Kozakowski
Mikolaj Pacek
Piotr Miłoś
210
3
0
12 Feb 2021
Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey
Journal of Artificial Intelligence Research (JAIR), 2020
Cédric Colas
Tristan Karch
Olivier Sigaud
Pierre-Yves Oudeyer
903
122
0
17 Dec 2020
On the model-based stochastic value gradient for continuous reinforcement learning
Conference on Learning for Dynamics & Control (L4DC), 2020
Brandon Amos
Samuel Stanton
Denis Yarats
A. Wilson
416
78
0
28 Aug 2020
A Unifying Framework for Reinforcement Learning and Planning
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
535
14
0
26 Jun 2020