ResearchTrend.AI
Offline Reinforcement Learning as One Big Sequence Modeling Problem


3 June 2021
Michael Janner
Qiyang Li
Sergey Levine
    OffRL

Papers citing "Offline Reinforcement Learning as One Big Sequence Modeling Problem"

50 / 465 papers shown
AutoCAT: Reinforcement Learning for Automated Exploration of Cache-Timing Attacks
Mulong Luo
Wenjie Xiong
G. G. Lee
Yueying Li
Xiaomeng Yang
Amy Zhang
Yuandong Tian
Hsien-Hsin S. Lee
G. E. Suh
AAML
32
10
0
17 Aug 2022
MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control
Nolan Wagener
Andrey Kolobov
Felipe Vieira Frujeri
Ricky Loynd
Ching-An Cheng
Matthew J. Hausknecht
6
21
0
15 Aug 2022
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
Zhendong Wang
Jonathan J. Hunt
Mingyuan Zhou
OffRL
22
332
0
12 Aug 2022
LATTE: LAnguage Trajectory TransformEr
A. Bucker
Luis F. C. Figueredo
Sami Haddadin
Ashish Kapoor
Shuang Ma
Sai H. Vemprala
Rogerio Bonatti
LM&Ro
24
59
0
04 Aug 2022
AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Qiyang Li
Ajay Jain
Pieter Abbeel
OffRL
37
4
0
03 Aug 2022
Addressing Optimism Bias in Sequence Modeling for Reinforcement Learning
Adam R. Villaflor
Zheng Huang
Swapnil Pande
John M. Dolan
J. Schneider
OffRL
22
23
0
21 Jul 2022
Transformers are Adaptable Task Planners
Vidhi Jain
Yixin Lin
Eric Undersander
Yonatan Bisk
Akshara Rai
15
24
0
06 Jul 2022
Goal-Conditioned Generators of Deep Policies
Francesco Faccio
Vincent Herrmann
Aditya A. Ramesh
Louis Kirsch
Jürgen Schmidhuber
OffRL
25
8
0
04 Jul 2022
Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation
Michael Chang
Thomas L. Griffiths
Sergey Levine
OCL
54
59
0
02 Jul 2022
Prompting Decision Transformer for Few-Shot Policy Generalization
Mengdi Xu
Yikang Shen
Shun Zhang
Yuchen Lu
Ding Zhao
J. Tenenbaum
Chuang Gan
OffRL
11
136
0
27 Jun 2022
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
Yunfei Li
Tian Gao
Jiaqi Yang
Huazhe Xu
Yi Wu
OffRL
14
22
0
24 Jun 2022
Behavior Transformers: Cloning $k$ modes with one stone
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
15
221
0
22 Jun 2022
Generative Pretraining for Black-Box Optimization
S. Krishnamoorthy
Satvik Mashkaria
Aditya Grover
OffRL
AI4CE
35
26
0
22 Jun 2022
AnyMorph: Learning Transferable Polices By Inferring Agent Morphology
Brandon Trabucco
Mariano Phielipp
Glen Berseth
26
27
0
17 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
42
347
0
17 Jun 2022
Bootstrapped Transformer for Offline Reinforcement Learning
Kerong Wang
Hanye Zhao
Xufang Luo
Kan Ren
Weinan Zhang
Dongsheng Li
OffRL
11
37
0
17 Jun 2022
Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination
Jiafei Lyu
Xiu Li
Zongqing Lu
OffRL
24
24
0
16 Jun 2022
Transformers are Meta-Reinforcement Learners
L. Melo
OffRL
20
50
0
14 Jun 2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning
Jiafei Lyu
Xiaoteng Ma
Xiu Li
Zongqing Lu
OffRL
23
101
0
09 Jun 2022
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
Rui Yang
Chenjia Bai
Xiaoteng Ma
Zhaoran Wang
Chongjie Zhang
Lei Han
OffRL
24
74
0
06 Jun 2022
Offline RL for Natural Language Generation with Implicit Language Q Learning
Charles Burton Snell
Ilya Kostrikov
Yi Su
Mengjiao Yang
Sergey Levine
OffRL
121
101
0
05 Jun 2022
Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning
David Brandfonbrener
Rémi Tachet des Combes
Romain Laroche
OffRL
29
5
0
02 Jun 2022
When does return-conditioned supervised learning work for offline reinforcement learning?
David Brandfonbrener
A. Bietti
Jacob Buckman
Romain Laroche
Joan Bruna
OffRL
25
60
0
02 Jun 2022
Deep Transformer Q-Networks for Partially Observable Reinforcement Learning
Kevin Esslinger
Robert W. Platt
Chris Amato
OffRL
27
32
0
02 Jun 2022
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Muning Wen
J. Kuba
Runji Lin
Weinan Zhang
Ying Wen
J. Wang
Yaodong Yang
26
178
0
30 May 2022
Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning
Qiang He
Huangyuan Su
Jieyu Zhang
Xinwen Hou
OOD
OffRL
15
6
0
29 May 2022
Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges
Massimo Caccia
Jonas W. Mueller
Taesup Kim
Laurent Charlin
Rasool Fakoor
CLL
17
8
0
28 May 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
47
206
0
26 May 2022
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Miao Lu
Yifei Min
Zhaoran Wang
Zhuoran Yang
OffRL
45
22
0
26 May 2022
History Compression via Language Models in Reinforcement Learning
Fabian Paischer
Thomas Adler
Vihang Patil
Angela Bitto-Nemling
Markus Holzleitner
Sebastian Lehner
Hamid Eghbalzadeh
Sepp Hochreiter
OffRL
AI4TS
11
42
0
24 May 2022
Chain of Thought Imitation with Procedure Cloning
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
25
29
0
22 May 2022
Planning with Diffusion for Flexible Behavior Synthesis
Michael Janner
Yilun Du
J. Tenenbaum
Sergey Levine
DiffM
202
626
0
20 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
54
783
0
12 May 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation
Maximilian Igl
Daewoo Kim
Alex Kuefler
Paul Mougin
Punit Shah
K. Shiarlis
Drago Anguelov
Mark Palatucci
Brandyn White
Shimon Whiteson
19
64
0
06 May 2022
Jump-Start Reinforcement Learning
Ikechukwu Uchendu
Ted Xiao
Yao Lu
Banghua Zhu
Mengyuan Yan
...
Chuyuan Fu
Cong Ma
Jiantao Jiao
Sergey Levine
Karol Hausman
OffRL
OnRL
28
107
0
05 Apr 2022
Unsupervised Learning of Temporal Abstractions with Slot-based Transformers
Anand Gopalakrishnan
Kazuki Irie
Jürgen Schmidhuber
Sjoerd van Steenkiste
OffRL
19
16
0
25 Mar 2022
Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers
A. Bucker
Luis F. C. Figueredo
Sami Haddadin
Ashish Kapoor
Shuang Ma
Rogerio Bonatti
LM&Ro
16
49
0
25 Mar 2022
Switch Trajectory Transformer with Distributional Value Approximation for Multi-Task Reinforcement Learning
Qinjie Lin
Han Liu
B. Sengupta
OffRL
11
11
0
14 Mar 2022
Context is Everything: Implicit Identification for Dynamics Adaptation
Ben Evans
Abitha Thankaraj
Lerrel Pinto
19
20
0
10 Mar 2022
Policy Architectures for Compositional Generalization in Control
Allan Zhou
Vikash Kumar
Chelsea Finn
Aravind Rajeswaran
18
22
0
10 Mar 2022
A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems
Rafael Figueiredo Prudencio
Marcos R. O. A. Máximo
Esther Luna Colombini
OffRL
18
221
0
02 Mar 2022
All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL
Kai Arulkumaran
Dylan R. Ashley
Jürgen Schmidhuber
R. Srivastava
OffRL
11
6
0
24 Feb 2022
Consistent Dropout for Policy Gradient Reinforcement Learning
Matthew J. Hausknecht
Nolan Wagener
OffRL
19
10
0
23 Feb 2022
Learning Relative Return Policies With Upside-Down Reinforcement Learning
Dylan R. Ashley
Kai Arulkumaran
Jürgen Schmidhuber
R. Srivastava
OffRL
11
1
0
23 Feb 2022
VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
Che Wang
Xufang Luo
Keith Ross
Dongsheng Li
OffRL
19
49
0
17 Feb 2022
Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu
Jialong Wu
Jiehui Xu
Jianmin Wang
Mingsheng Long
6
90
0
13 Feb 2022
Supported Policy Optimization for Offline Reinforcement Learning
Jialong Wu
Haixu Wu
Zihan Qiu
Jianmin Wang
Mingsheng Long
OffRL
27
64
0
13 Feb 2022
Online Decision Transformer
Qinqing Zheng
Amy Zhang
Aditya Grover
OffRL
16
202
0
11 Feb 2022
SQUIRE: A Sequence-to-sequence Framework for Multi-hop Knowledge Graph Reasoning
Yushi Bai
Xin Lv
Juanzi Li
Lei Hou
Yincen Qu
Zelin Dai
Feiyu Xiong
RALM
OffRL
LRM
34
20
0
17 Jan 2022
RvS: What is Essential for Offline RL via Supervised Learning?
Scott Emmons
Benjamin Eysenbach
Ilya Kostrikov
Sergey Levine
OffRL
15
169
0
20 Dec 2021