ResearchTrend.AI
Mastering Atari Games with Limited Data

30 October 2021
Weirui Ye
Shao-Wei Liu
Thanard Kurutach
Pieter Abbeel
Yang Gao
    VLM

Papers citing "Mastering Atari Games with Limited Data"

50 / 159 papers shown
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
Yang Dai
Oubo Ma
Longfei Zhang
Xingxing Liang
Shengchao Hu
Mengzhu Wang
Shouling Ji
Jincai Huang
Li Shen
Mamba
31
4
0
20 May 2024
Efficient Multi-agent Reinforcement Learning by Planning
Qihan Liu
Jianing Ye
Xiaoteng Ma
Jun Yang
Bin Liang
Chongjie Zhang
27
3
0
20 May 2024
Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning
Xin Liu
Yaran Chen
Dong Zhao
35
1
0
20 May 2024
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
Zhepeng Cen
Yi-Fan Yao
Zuxin Liu
Ding Zhao
OffRL
32
3
0
20 May 2024
Tree Search-Based Policy Optimization under Stochastic Execution Delay
David Valensi
E. Derman
Shie Mannor
Gal Dalal
19
3
0
08 Apr 2024
Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration
Yibo Wang
Jiang Zhao
OffRL
OnRL
21
0
0
31 Mar 2024
Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces
Toshihiro Ota
Mamba
31
16
0
29 Mar 2024
Reinforcement Learning from Delayed Observations via World Models
Armin Karamzade
Kyungmin Kim
Montek Kalsi
Roy Fox
33
4
0
18 Mar 2024
MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning
Zohar Rimon
Tom Jurgenson
Orr Krupnik
Gilad Adler
Aviv Tamar
29
8
0
14 Mar 2024
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
Guanxing Lu
Shiyi Zhang
Ziwei Wang
Changliu Liu
Jiwen Lu
Yansong Tang
44
49
0
13 Mar 2024
Mastering Memory Tasks with World Models
Mohammad Reza Samsami
Artem Zholus
Janarthanan Rajendran
Sarath Chandar
CLL
OffRL
29
21
0
07 Mar 2024
World Models for Autonomous Driving: An Initial Survey
Yanchen Guan
Haicheng Liao
Zhenning Li
Jia Hu
Runze Yuan
Yunjian Li
Guohui Zhang
Chengzhong Xu
32
31
0
05 Mar 2024
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang
Shaohuai Liu
Weirui Ye
Jiacheng You
Yang Gao
OffRL
21
10
0
01 Mar 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
25
5
0
28 Feb 2024
Improving Token-Based World Models with Parallel Observation Prediction
Lior Cohen
Kaixin Wang
Bingyi Kang
Shie Mannor
18
2
0
08 Feb 2024
Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
Zihan Ding
Amy Zhang
Yuandong Tian
Qinqing Zheng
OffRL
37
17
0
05 Feb 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL
OffRL
18
6
0
05 Feb 2024
Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence
Jiafei Lyu
Le Wan
Xiu Li
Zongqing Lu
CML
OffRL
33
2
0
05 Feb 2024
Bridging State and History Representations: Understanding Self-Predictive RL
Tianwei Ni
Benjamin Eysenbach
Erfan Seyedsalehi
Michel Ma
Clement Gehring
Aditya Mahajan
Pierre-Luc Bacon
AI4TS
AI4CE
17
20
0
17 Jan 2024
Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding
Alexey Skrynnik
Anton Andreychuk
Konstantin Yakovlev
Aleksandr I. Panov
28
10
0
26 Dec 2023
World Models via Policy-Guided Trajectory Diffusion
Marc Rigter
Jun Yamada
Ingmar Posner
21
19
0
13 Dec 2023
TD-MPC2: Scalable, Robust World Models for Continuous Control
Nicklas Hansen
Hao Su
Xiaolong Wang
MU
27
42
0
25 Oct 2023
One is More: Diverse Perspectives within a Single Network for Efficient DRL
Yiqin Tan
Ling Pan
Longbo Huang
OffRL
30
0
0
21 Oct 2023
MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games
Ti-Rong Wu
Hung Guei
Pei-Chiun Peng
Po-Wei Huang
Ting Han Wei
Chung-Chin Shih
Yun-Jui Tsai
14
7
0
17 Oct 2023
STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning
Weipu Zhang
Gang Wang
Jian-jun Sun
Yetian Yuan
Gao Huang
61
31
0
14 Oct 2023
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
Yazhe Niu
Yuan Pu
Zhenjie Yang
Xueyan Li
Tong Zhou
Jiyuan Ren
Shuai Hu
Hongsheng Li
Yu Liu
85
12
0
12 Oct 2023
Accelerating Monte Carlo Tree Search with Probability Tree State Abstraction
Yangqing Fu
Mingdong Sun
Buqing Nie
Yue Gao
47
3
0
10 Oct 2023
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Paul Mattes
Rainer Schlosser
R. Herbrich
16
4
0
08 Oct 2023
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Xiaoxiao Sun
Yang Yang
Michal Shlapentokh-Rothman
Haohan Wang
Yu-xiong Wang
LRM
AI4CE
LM&Ro
LLMAG
34
183
0
06 Oct 2023
Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making
Jeonghye Kim
Suyoung Lee
Woojun Kim
Young-Jin Sung
OffRL
31
17
0
04 Oct 2023
HarmonyDream: Task Harmonization Inside World Models
Haoyu Ma
Jialong Wu
Ningya Feng
Chenjun Xiao
Dong Li
Jianye Hao
Jianmin Wang
Mingsheng Long
33
7
0
30 Sep 2023
GAIA-1: A Generative World Model for Autonomous Driving
Masane Fuchi
Lloyd Russell
Hudson Yeo
Zak Murez
Hiroto Minami
Alex Kendall
Tomohiro Takagi
Gianluca Corrado
VGen
28
215
0
29 Sep 2023
Towards High Efficient Long-horizon Planning with Expert-guided Motion-encoding Tree Search
Tong Zhou
Erli Lyu
Jiaole Wang
Guangdu Cen
Ziqi Zha
Senmao Qi
Max Q.-H. Meng
16
2
0
26 Sep 2023
MoDem-V2: Visuo-Motor World Models for Real-World Robot Manipulation
Patrick E. Lancaster
Nicklas Hansen
Aravind Rajeswaran
Vikash Kumar
LM&Ro
25
14
0
25 Sep 2023
Preference-conditioned Pixel-based AI Agent For Game Testing
Sherif M. Abdelfattah
Adrian Brown
Pushi Zhang
6
2
0
18 Aug 2023
AI planning in the imagination: High-level planning on learned abstract search spaces
Carlos Martin
T. Sandholm
29
0
0
16 Aug 2023
BarlowRL: Barlow Twins for Data-Efficient Reinforcement Learning
Omer Veysel Cagatan
Barış Akgün
BDL
OffRL
21
3
0
08 Aug 2023
Thinker: Learning to Plan and Act
Stephen Chung
Ivan Anokhin
David M. Krueger
LLMAG
OffRL
LRM
22
5
0
27 Jul 2023
Monte-Carlo Tree Search for Multi-Agent Pathfinding: Preliminary Results
Yelisey Pitanov
Alexey Skrynnik
Anton Andreychuk
Konstantin Yakovlev
Aleksandr I. Panov
16
2
0
25 Jul 2023
Reparameterized Policy Learning for Multimodal Trajectory Optimization
Zhiao Huang
Litian Liang
Z. Ling
Xuanlin Li
Chuang Gan
H. Su
21
10
0
20 Jul 2023
SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks
Xingyu Lin
John So
Sashwat Mahalingam
Fangchen Liu
Pieter Abbeel
SSL
22
21
0
07 Jul 2023
λ-models: Effective Decision-Aware Reinforcement Learning with Latent Models
C. Voelcker
Arash Ahmadian
Romina Abachi
Igor Gilitschenski
Amir-massoud Farahmand
51
0
0
30 Jun 2023
Is Pre-training Truly Better Than Meta-Learning?
Brando Miranda
P. Yu
Saumya Goyal
Yu-xiong Wang
Oluwasanmi Koyejo
39
5
0
24 Jun 2023
Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data
Alycia Lee
Brando Miranda
Sudharsan Sundar
Sanmi Koyejo
32
6
0
24 Jun 2023
TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning
Ruijie Zheng
Xiyao Wang
Yanchao Sun
Shuang Ma
Jieyu Zhao
Huazhe Xu
Hal Daumé
Furong Huang
43
35
0
22 Jun 2023
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning
Hojoon Lee
Hanseul Cho
Hyunseung Kim
Daehoon Gwak
Joonkee Kim
Jaegul Choo
Se-Young Yun
Chulhee Yun
OffRL
76
25
0
19 Jun 2023
Residual Q-Learning: Offline and Online Policy Customization without Value
Chenran Li
Chen Tang
Haruki Nishimura
Jean-Pierre Mercat
M. Tomizuka
Wei Zhan
OffRL
25
6
0
15 Jun 2023
Simplified Temporal Consistency Reinforcement Learning
Yi Zhao
Wenshuai Zhao
Rinu Boney
Juho Kannala
J. Pajarinen
OffRL
25
12
0
15 Jun 2023
Agents Explore the Environment Beyond Good Actions to Improve Their Model for Better Decisions
Matthias Unverzagt
LLMAG
22
0
0
06 Jun 2023
Model-Based Reinforcement Learning with Multi-Task Offline Pretraining
Minting Pan
Yitao Zheng
Yunbo Wang
Xiaokang Yang
OffRL
21
0
0
06 Jun 2023