2406.05064
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
7 June 2024
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert D. Nowak
Papers citing
"Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning"
50 / 91 papers shown
Title
In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks
Huitao Yang
Guanting Chen
OffRL
48
0
0
30 Sep 2025
In-Context Compositional Q-Learning for Offline Reinforcement Learning
Qiushui Xu
Yuhao Huang
Yushu Jiang
Lei Song
Jinyu Wang
Wenliang Zheng
Jiang Bian
OffRL
84
0
0
28 Sep 2025
HVAC-DPT: A Decision Pretrained Transformer for HVAC Control
Anaïs Berkes
AI4CE
319
1
0
29 Nov 2024
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems
Neural Information Processing Systems (NeurIPS), 2024
Yue Kang
Cho-Jui Hsieh
T. C. Lee
283
22
0
14 Jan 2024
Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges
Xiaoqian Liu
Jianbin Jiao
Junge Zhang
OffRL
LRM
283
2
0
29 Dec 2023
In-Context Reinforcement Learning for Variable Action Spaces
Viacheslav Sinii
Alexander Nikulin
Vladislav Kurenkov
Ilya Zisman
Sergey Kolesnikov
629
22
0
20 Dec 2023
Multi-task Representation Learning for Pure Exploration in Bilinear Bandits
Neural Information Processing Systems (NeurIPS), 2023
Subhojyoti Mukherjee
Qiaomin Xie
Josiah P. Hanna
Robert D. Nowak
329
6
0
01 Nov 2023
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Yi-An Ma
Chenjun Xiao
Hebin Liang
Jianye Hao
OffRL
170
13
0
01 Nov 2023
Transformers are Provably Optimal In-context Estimators for Wireless Communications
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Vishnu Teja Kunde
Vicram Rajagopalan
Chandra Shekhara Kaushik Valmeekam
Krishna R. Narayanan
S. Shakkottai
D. Kalathil
J. Chamberland
488
10
0
01 Nov 2023
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
International Conference on Learning Representations (ICLR), 2023
Licong Lin
Yu Bai
Song Mei
OffRL
265
66
0
12 Oct 2023
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency
Zhihan Liu
Hao Hu
Shenao Zhang
Hongyi Guo
Shuqi Ke
Boyi Liu
Zhaoran Wang
LLMAG
LRM
384
44
0
29 Sep 2023
Bayesian Low-rank Adaptation for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Adam X. Yang
Maxime Robeyns
Xi Wang
Laurence Aitchison
AI4CE
BDL
591
80
0
24 Aug 2023
ExpeL: LLM Agents Are Experiential Learners
AAAI Conference on Artificial Intelligence (AAAI), 2023
Andrew Zhao
Daniel Huang
Quentin Xu
Matthieu Lin
Wenshu Fan
Gao Huang
LLMAG
387
325
0
20 Aug 2023
Large Language Models as General Pattern Machines
Conference on Robot Learning (CoRL), 2023
Suvir Mirchandani
F. Xia
Peter R. Florence
Brian Ichter
Danny Driess
Montse Gonzalez Arenas
Kanishka Rao
Dorsa Sadigh
Andy Zeng
LLMAG
256
251
0
10 Jul 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Jonathan Lee
Annie Xie
Aldo Pacchiano
Yash Chandak
Chelsea Finn
Ofir Nachum
Emma Brunskill
OffRL
292
116
0
26 Jun 2023
Large Language Models are Few-Shot Health Learners
Xin Liu
Daniel J. McDuff
G. Kovács
I. Galatzer-Levy
Jacob Sunshine
Jiening Zhan
M. Poh
Shun Liao
P. Achille
Shwetak N. Patel
LM&MA
AI4MH
274
143
0
24 May 2023
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Sina J. Semnani
Violet Z. Yao
He Zhang
M. Lam
KELM
AI4MH
259
100
0
23 May 2023
Structured State Space Models for In-Context Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Chris Xiaoxuan Lu
Yannick Schroecker
Albert Gu
Emilio Parisotto
Jakob N. Foerster
Satinder Singh
Feryal M. P. Behbahani
AI4TS
441
127
0
07 Mar 2023
Multi-task Representation Learning for Pure Exploration in Linear Bandits
International Conference on Machine Learning (ICML), 2023
Yihan Du
Longbo Huang
Wen Sun
323
4
0
09 Feb 2023
Transformers as Algorithms: Generalization and Stability in In-context Learning
International Conference on Machine Learning (ICML), 2023
Yingcong Li
M. E. Ildiz
Dimitris Papailiopoulos
Samet Oymak
243
217
0
17 Jan 2023
Optimal Algorithms for Latent Bandits with Cluster Structure
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
S. Pal
A. Suggala
Karthikeyan Shanmugam
Prateek Jain
314
12
0
17 Jan 2023
RT-1: Robotics Transformer for Real-World Control at Scale
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Joseph Dabis
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
423
1,646
0
13 Dec 2022
Multi-Task Off-Policy Learning from Bandit Feedback
International Conference on Machine Learning (ICML), 2022
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
OffRL
257
11
0
09 Dec 2022
Learning Options via Compression
Neural Information Processing Systems (NeurIPS), 2022
Yiding Jiang
Emmy Liu
Benjamin Eysenbach
Zico Kolter
Chelsea Finn
OffRL
245
20
0
08 Dec 2022
In-context Reinforcement Learning with Algorithm Distillation
International Conference on Learning Representations (ICLR), 2022
Michael Laskin
Luyu Wang
Junhyuk Oh
Emilio Parisotto
Stephen Spencer
...
Ethan A. Brooks
Maxime Gazeau
Himanshu Sahni
Satinder Singh
Volodymyr Mnih
OffRL
169
165
0
25 Oct 2022
Dichotomy of Control: Separating What You Can Control from What You Cannot
International Conference on Learning Representations (ICLR), 2022
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
178
45
0
24 Oct 2022
Tractable Optimality in Episodic Latent MABs
Neural Information Processing Systems (NeurIPS), 2022
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
238
3
0
05 Oct 2022
Partially Observable Markov Decision Processes in Robotics: A Survey
IEEE Transactions on Robotics (TRO), 2022
M. Lauri
David Hsu
Joni Pajarinen
293
153
0
21 Sep 2022
On The Computational Complexity of Self-Attention
International Conference on Algorithmic Learning Theory (ALT), 2022
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
Chinmay Hegde
261
211
0
11 Sep 2022
Behavior Transformers: Cloning k modes with one stone
Neural Information Processing Systems (NeurIPS), 2022
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
253
325
0
22 Jun 2022
When does return-conditioned supervised learning work for offline reinforcement learning?
Neural Information Processing Systems (NeurIPS), 2022
David Brandfonbrener
A. Bietti
Jacob Buckman
Romain Laroche
Joan Bruna
OffRL
193
80
0
02 Jun 2022
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Neural Information Processing Systems (NeurIPS), 2022
Seyed Kamyar Seyed Ghasemipour
S. Gu
Ofir Nachum
OffRL
179
85
0
27 May 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL
AI4TS
435
300
0
20 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
367
958
0
12 May 2022
Few-shot learning for medical text: A systematic review
Yao Ge
Yuting Guo
Yuan-Chi Yang
M. Al-garadi
A. Sarker
149
21
0
21 Apr 2022
Nearly Minimax Algorithms for Linear Bandits with Shared Representation
Jiaqi Yang
Qi Lei
Jason D. Lee
S. Du
205
16
0
29 Mar 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG
LRM
428
1,764
0
25 Feb 2022
Deep Hierarchy in Bandits
International Conference on Machine Learning (ICML), 2022
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
162
20
0
03 Feb 2022
Transformers Can Do Bayesian Inference
International Conference on Learning Representations (ICLR), 2021
Samuel G. Müller
Noah Hollmann
Sebastian Pineda Arango
Josif Grabocka
Katharina Eggensperger
BDL
UQCV
743
229
0
20 Dec 2021
Hierarchical Bayesian Bandits
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Joey Hong
Branislav Kveton
Manzil Zaheer
Mohammad Ghavamzadeh
FedML
285
42
0
12 Nov 2021
An Explanation of In-context Learning as Implicit Bayesian Inference
International Conference on Learning Representations (ICLR), 2021
Sang Michael Xie
Aditi Raghunathan
Abigail Z. Jacobs
Tengyu Ma
ReLM
BDL
VPVLM
LRM
903
906
0
03 Nov 2021
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems
Andrea Madotto
Mohammad Kachuee
Genta Indra Winata
Pascale Fung
167
91
0
15 Oct 2021
Offline Meta-Reinforcement Learning with Online Self-Supervision
International Conference on Machine Learning (ICML), 2021
Vitchyr H. Pong
Ashvin Nair
Laura M. Smith
Catherine Huang
Sergey Levine
OffRL
320
75
0
08 Jul 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Neural Information Processing Systems (NeurIPS), 2021
Michael Janner
Qiyang Li
Sergey Levine
OffRL
541
776
0
03 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Neural Information Processing Systems (NeurIPS), 2021
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
495
1,950
0
02 Jun 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Neural Information Processing Systems (NeurIPS), 2021
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
519
469
0
16 Feb 2021
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
Neural Information Processing Systems (NeurIPS), 2021
Kefan Dong
Jiaqi Yang
Tengyu Ma
419
37
0
08 Feb 2021
Impact of Representation Learning in Linear Bandits
Jiaqi Yang
Wei Hu
Jason D. Lee
S. Du
244
55
0
13 Oct 2020
Offline Meta-Reinforcement Learning with Advantage Weighting
E. Mitchell
Rafael Rafailov
Xue Bin Peng
Sergey Levine
Chelsea Finn
OffRL
324
113
0
13 Aug 2020
Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
International Conference on Machine Learning (ICML), 2020
Emmy Liu
Aditi Raghunathan
Abigail Z. Jacobs
Chelsea Finn
OffRL
480
74
0
06 Aug 2020