2106.02039
Cited By
Offline Reinforcement Learning as One Big Sequence Modeling Problem
3 June 2021
Michael Janner
Qiyang Li
Sergey Levine
OffRL
Papers citing
"Offline Reinforcement Learning as One Big Sequence Modeling Problem"
50 / 465 papers shown
Title
Sample-efficient Imitative Multi-token Decision Transformer for Generalizable Real World Driving
Hang Zhou
Dan Xu
Yiding Ji
OffRL
32
0
0
18 Jun 2024
Transcendence: Generative Models Can Outperform The Experts That Train Them
Edwin Zhang
Vincent Zhu
Naomi Saphra
Anat Kleiman
Benjamin L. Edelman
Milind Tambe
Sham Kakade
Eran Malach
19
10
0
17 Jun 2024
Generalisation to unseen topologies: Towards control of biological neural network activity
Laurens Engwegen
Daan Brinks
Wendelin Bohmer
MedIm
AI4CE
27
0
0
17 Jun 2024
UniZero: Generalized and Efficient Planning with Scalable Latent World Models
Yuan Pu
Yazhe Niu
Jiyuan Ren
Zhenjie Yang
Hongsheng Li
Yu Liu
OffRL
41
1
0
15 Jun 2024
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Alexander Nikulin
Ilya Zisman
Alexey Zemtsov
Viacheslav Sinii
105
4
0
13 Jun 2024
Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei
Aidan Scannell
J. Pajarinen
OffRL
45
1
0
12 Jun 2024
BAKU: An Efficient Transformer for Multi-Task Policy Learning
Siddhant Haldar
Zhuoran Peng
Lerrel Pinto
OffRL
32
26
0
11 Jun 2024
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
Chang Chen
Junyeob Baek
Fei Deng
Kenji Kawaguchi
Çağlar Gülçehre
Sungjin Ahn
OffRL
25
1
0
10 Jun 2024
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Qi Lv
Xiang Deng
Gongwei Chen
Michael Yu Wang
Liqiang Nie
70
7
0
08 Jun 2024
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert Nowak
61
2
0
07 Jun 2024
On Limitation of Transformer for Learning HMMs
Jiachen Hu
Qinghua Liu
Chi Jin
42
3
0
06 Jun 2024
TSPDiffuser: Diffusion Models as Learned Samplers for Traveling Salesperson Path Planning Problems
Ryo Yonetani
39
1
0
05 Jun 2024
Satellites swarm cooperation for pursuit-attachment tasks with transformer-based reinforcement learning
Yonghao Li
27
0
0
03 Jun 2024
Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling
Sili Huang
Jifeng Hu
Zhe Yang
Liwei Yang
Tao Luo
Hechang Chen
Lichao Sun
Bo Yang
Mamba
29
3
0
31 May 2024
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
Sili Huang
Jifeng Hu
Hechang Chen
Lichao Sun
Bo Yang
OffRL
LRM
25
7
0
31 May 2024
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang
Ruoxue Liu
Jing Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
46
1
0
31 May 2024
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
Hengkai Tan
Songming Liu
Kai Ma
Chengyang Ying
Xingxing Zhang
Hang Su
Jun Zhu
29
2
0
30 May 2024
Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models
Zeyu Fang
Tian Lan
OffRL
28
2
0
30 May 2024
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
Zhanhui Zhou
Zhixuan Liu
Jie Liu
Zhichen Dong
Chao Yang
Yu Qiao
ALM
36
20
0
29 May 2024
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
Tianle Zhang
Jiayi Guan
Lin Zhao
Yihang Li
Dongjiang Li
...
Lei Sun
Yue Chen
Xuelong Wei
Lusong Li
Xiaodong He
35
1
0
29 May 2024
Data-Efficient Approach to Humanoid Control via Fine-Tuning a Pre-Trained GPT on Action Data
Siddharth Padmanabhan
Kazuki Miyazawa
Takato Horii
Takayuki Nagai
23
0
0
29 May 2024
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Shengchao Hu
Ziqing Fan
Li Shen
Ya-Qin Zhang
Yanfeng Wang
Dacheng Tao
OffRL
33
9
0
28 May 2024
Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree
Lang Feng
Pengjie Gu
Bo An
Gang Pan
34
2
0
28 May 2024
Position: Foundation Agents as the Paradigm Shift for Decision Making
Xiaoqian Liu
Xingzhou Lou
Jianbin Jiao
Junge Zhang
OffRL
LLMAG
31
5
0
27 May 2024
DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
Jinxin Liu
Xinghong Guo
Zifeng Zhuang
Donglin Wang
DiffM
OffRL
42
2
0
23 May 2024
Reinforcing Language Agents via Policy Optimization with Action Decomposition
Muning Wen
Ziyu Wan
Weinan Zhang
Jun Wang
Ying Wen
38
7
0
23 May 2024
Variational Delayed Policy Optimization
Qingyuan Wu
S. Zhan
Yixuan Wang
Yuhui Wang
Chung-Wei Lin
Chen Lv
Qi Zhu
Chao Huang
OffRL
28
4
0
23 May 2024
Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making
Hanzhao Wang
Yu Pan
Fupeng Sun
Shang Liu
K. Talluri
Guanting Chen
Xiaocheng Li
OffRL
53
1
0
23 May 2024
Transformers for Image-Goal Navigation
Nikhilanj Pelluri
ViT
30
0
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
69
41
0
23 May 2024
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
Yang Dai
Oubo Ma
Longfei Zhang
Xingxing Liang
Shengchao Hu
Mengzhu Wang
Shouling Ji
Jincai Huang
Li Shen
Mamba
31
4
0
20 May 2024
Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning
Hai Zhang
Boyuan Zheng
Anqi Guo
Tianying Ji
Junqiao Zhao
Lanqing Li
OffRL
34
0
0
20 May 2024
A Minimalist Prompt for Zero-Shot Policy Learning
Meng Song
Xuezhi Wang
Tanay Biradar
Yao Qin
Manmohan Chandraker
OffRL
19
1
0
09 May 2024
From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control
Yide Shentu
Philipp Wu
Aravind Rajeswaran
Pieter Abbeel
32
9
0
08 May 2024
Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows
Minjae Cho
Jonathan P. How
Chuangchuang Sun
OODD
OffRL
30
1
0
06 May 2024
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
Aditya A. Ramesh
Kenny Young
Louis Kirsch
Jürgen Schmidhuber
21
1
0
06 May 2024
DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets
Xiaoyu Huang
Yufeng Chi
Ruofeng Wang
Zhongyu Li
Xue Bin Peng
Sophia Shao
Borivoje Nikolic
K. Sreenath
OffRL
75
26
0
30 Apr 2024
Reinforcement Learning Problem Solving with Large Language Models
Sina Gholamian
Domingo Huh
24
0
0
29 Apr 2024
From Cognition to Computation: A Comparative Review of Human Attention and Transformer Architectures
Minglu Zhao
Dehong Xu
Tao Gao
40
4
0
25 Apr 2024
Playing Board Games with the Predict Results of Beam Search Algorithm
Sergey Pastukhov
13
0
0
23 Apr 2024
Transformer Based Planning in the Observation Space with Applications to Trick Taking Card Games
Douglas Rebstock
Christopher Solinas
Nathan R Sturtevant
M. Buro
19
0
0
19 Apr 2024
X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner
Haoyuan Jiang
Ziyue Li
Hua Wei
Xuantang Xiong
Jingqing Ruan
Jiaming Lu
Hangyu Mao
Rui Zhao
25
8
0
18 Apr 2024
Self-adaptive PSRO: Towards an Automatic Population-based Game Solver
Pengdeng Li
Shuxin Li
Chang Yang
Xinrun Wang
Xiao Huang
Hau Chan
Bo An
29
1
0
17 Apr 2024
Offline Trajectory Generalization for Offline Reinforcement Learning
Ziqi Zhao
Zhaochun Ren
Liu Yang
Fajie Yuan
Pengjie Ren
Zhumin Chen
Jun Ma
Xin Xin
OffRL
16
1
0
16 Apr 2024
An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization
Minshuo Chen
Song Mei
Jianqing Fan
Mengdi Wang
VLM
MedIm
DiffM
37
48
0
11 Apr 2024
Generative Probabilistic Planning for Optimizing Supply Chain Networks
Hyung-il Ahn
Santiago Olivar
Hershel Mehta
Young Chol Song
32
0
0
11 Apr 2024
Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning
Xudong Yu
Chenjia Bai
Hongyi Guo
Changhong Wang
Zhen Wang
OffRL
37
0
0
09 Apr 2024
Regularized Conditional Diffusion Model for Multi-Task Preference Alignment
Xudong Yu
Chenjia Bai
Haoran He
Changhong Wang
Xuelong Li
32
6
0
07 Apr 2024
TransformerLSR: Attentive Joint Model of Longitudinal Data, Survival, and Recurrent Events with Concurrent Latent Structure
Zhiyue Zhang
Yao Zhao
Yan Xu
19
0
0
04 Apr 2024
A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches
Zhigen Zhao
Shuo Cheng
Yan Ding
Ziyi Zhou
Shiqi Zhang
Danfei Xu
Ye Zhao
38
22
0
03 Apr 2024