arXiv: 2406.05064
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

7 June 2024
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert D. Nowak

Papers citing "Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning"

50 / 91 papers shown
In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks
Huitao Yang
Guanting Chen
OffRL
48
0
0
30 Sep 2025
In-Context Compositional Q-Learning for Offline Reinforcement Learning
Qiushui Xu
Yuhao Huang
Yushu Jiang
Lei Song
Jinyu Wang
Wenliang Zheng
Jiang Bian
OffRL
84
0
0
28 Sep 2025
HVAC-DPT: A Decision Pretrained Transformer for HVAC Control
Anaïs Berkes
AI4CE
319
1
0
29 Nov 2024
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems
Neural Information Processing Systems (NeurIPS), 2024
Yue Kang
Cho-Jui Hsieh
T. C. Lee
283
22
0
14 Jan 2024
Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges
Xiaoqian Liu
Jianbin Jiao
Junge Zhang
OffRL, LRM
283
2
0
29 Dec 2023
In-Context Reinforcement Learning for Variable Action Spaces
Viacheslav Sinii
Alexander Nikulin
Vladislav Kurenkov
Ilya Zisman
Sergey Kolesnikov
629
22
0
20 Dec 2023
Multi-task Representation Learning for Pure Exploration in Bilinear Bandits
Neural Information Processing Systems (NeurIPS), 2023
Subhojyoti Mukherjee
Qiaomin Xie
Josiah P. Hanna
Robert D. Nowak
329
6
0
01 Nov 2023
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Yi-An Ma
Chenjun Xiao
Hebin Liang
Jianye Hao
OffRL
170
13
0
01 Nov 2023
Transformers are Provably Optimal In-context Estimators for Wireless Communications
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Vishnu Teja Kunde
Vicram Rajagopalan
Chandra Shekhara Kaushik Valmeekam
Krishna R. Narayanan
S. Shakkottai
D. Kalathil
J. Chamberland
488
10
0
01 Nov 2023
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
International Conference on Learning Representations (ICLR), 2023
Licong Lin
Yu Bai
Song Mei
OffRL
265
66
0
12 Oct 2023
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency
Zhihan Liu
Hao Hu
Shenao Zhang
Hongyi Guo
Shuqi Ke
Boyi Liu
Zhaoran Wang
LLMAG, LRM
384
44
0
29 Sep 2023
Bayesian Low-rank Adaptation for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Adam X. Yang
Maxime Robeyns
Xi Wang
Laurence Aitchison
AI4CE, BDL
591
80
0
24 Aug 2023
ExpeL: LLM Agents Are Experiential Learners
AAAI Conference on Artificial Intelligence (AAAI), 2023
Andrew Zhao
Daniel Huang
Quentin Xu
Matthieu Lin
Wenshu Fan
Gao Huang
LLMAG
387
325
0
20 Aug 2023
Large Language Models as General Pattern Machines
Conference on Robot Learning (CoRL), 2023
Suvir Mirchandani
F. Xia
Peter R. Florence
Brian Ichter
Danny Driess
Montse Gonzalez Arenas
Kanishka Rao
Dorsa Sadigh
Andy Zeng
LLMAG
256
251
0
10 Jul 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Jonathan Lee
Annie Xie
Aldo Pacchiano
Yash Chandak
Chelsea Finn
Ofir Nachum
Emma Brunskill
OffRL
292
116
0
26 Jun 2023
Large Language Models are Few-Shot Health Learners
Xin Liu
Daniel J. McDuff
G. Kovács
I. Galatzer-Levy
Jacob Sunshine
Jiening Zhan
M. Poh
Shun Liao
P. Achille
Shwetak N. Patel
LM&MA, AI4MH
274
143
0
24 May 2023
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Sina J. Semnani
Violet Z. Yao
He Zhang
M. Lam
KELM, AI4MH
259
100
0
23 May 2023
Structured State Space Models for In-Context Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Chris Xiaoxuan Lu
Yannick Schroecker
Albert Gu
Emilio Parisotto
Jakob N. Foerster
Satinder Singh
Feryal M. P. Behbahani
AI4TS
441
127
0
07 Mar 2023
Multi-task Representation Learning for Pure Exploration in Linear Bandits
International Conference on Machine Learning (ICML), 2023
Yihan Du
Longbo Huang
Wen Sun
323
4
0
09 Feb 2023
Transformers as Algorithms: Generalization and Stability in In-context Learning
International Conference on Machine Learning (ICML), 2023
Yingcong Li
M. E. Ildiz
Dimitris Papailiopoulos
Samet Oymak
243
217
0
17 Jan 2023
Optimal Algorithms for Latent Bandits with Cluster Structure
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
S. Pal
A. Suggala
Karthikeyan Shanmugam
Prateek Jain
314
12
0
17 Jan 2023
RT-1: Robotics Transformer for Real-World Control at Scale
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Joseph Dabis
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
423
1,646
0
13 Dec 2022
Multi-Task Off-Policy Learning from Bandit Feedback
International Conference on Machine Learning (ICML), 2022
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
OffRL
257
11
0
09 Dec 2022
Learning Options via Compression
Neural Information Processing Systems (NeurIPS), 2022
Yiding Jiang
Emmy Liu
Benjamin Eysenbach
Zico Kolter
Chelsea Finn
OffRL
245
20
0
08 Dec 2022
In-context Reinforcement Learning with Algorithm Distillation
International Conference on Learning Representations (ICLR), 2022
Michael Laskin
Luyu Wang
Junhyuk Oh
Emilio Parisotto
Stephen Spencer
...
Ethan A. Brooks
Maxime Gazeau
Himanshu Sahni
Satinder Singh
Volodymyr Mnih
OffRL
169
165
0
25 Oct 2022
Dichotomy of Control: Separating What You Can Control from What You Cannot
International Conference on Learning Representations (ICLR), 2022
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
178
45
0
24 Oct 2022
Tractable Optimality in Episodic Latent MABs
Neural Information Processing Systems (NeurIPS), 2022
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
238
3
0
05 Oct 2022
Partially Observable Markov Decision Processes in Robotics: A Survey
IEEE Transactions on Robotics (TRO), 2022
M. Lauri
David Hsu
Joni Pajarinen
293
153
0
21 Sep 2022
On The Computational Complexity of Self-Attention
International Conference on Algorithmic Learning Theory (ALT), 2022
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
Chinmay Hegde
261
211
0
11 Sep 2022
Behavior Transformers: Cloning $k$ modes with one stone
Neural Information Processing Systems (NeurIPS), 2022
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
253
325
0
22 Jun 2022
When does return-conditioned supervised learning work for offline reinforcement learning?
Neural Information Processing Systems (NeurIPS), 2022
David Brandfonbrener
A. Bietti
Jacob Buckman
Romain Laroche
Joan Bruna
OffRL
193
80
0
02 Jun 2022
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Neural Information Processing Systems (NeurIPS), 2022
Seyed Kamyar Seyed Ghasemipour
S. Gu
Ofir Nachum
OffRL
179
85
0
27 May 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL, AI4TS
435
300
0
20 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro, LLMAG, AI4CE
367
958
0
12 May 2022
Few-shot learning for medical text: A systematic review
Yao Ge
Yuting Guo
Yuan-Chi Yang
M. Al-garadi
A. Sarker
149
21
0
21 Apr 2022
Nearly Minimax Algorithms for Linear Bandits with Shared Representation
Jiaqi Yang
Qi Lei
Jason D. Lee
S. Du
205
16
0
29 Mar 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG, LRM
428
1,764
0
25 Feb 2022
Deep Hierarchy in Bandits
International Conference on Machine Learning (ICML), 2022
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
162
20
0
03 Feb 2022
Transformers Can Do Bayesian Inference
International Conference on Learning Representations (ICLR), 2021
Samuel G. Müller
Noah Hollmann
Sebastian Pineda Arango
Josif Grabocka
Katharina Eggensperger
BDL, UQCV
743
229
0
20 Dec 2021
Hierarchical Bayesian Bandits
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Joey Hong
Branislav Kveton
Manzil Zaheer
Mohammad Ghavamzadeh
FedML
285
42
0
12 Nov 2021
An Explanation of In-context Learning as Implicit Bayesian Inference
International Conference on Learning Representations (ICLR), 2021
Sang Michael Xie
Aditi Raghunathan
Abigail Z. Jacobs
Tengyu Ma
ReLM, BDL, VPVLM, LRM
903
906
0
03 Nov 2021
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems
Andrea Madotto
Mohammad Kachuee
Genta Indra Winata
Pascale Fung
167
91
0
15 Oct 2021
Offline Meta-Reinforcement Learning with Online Self-Supervision
International Conference on Machine Learning (ICML), 2021
Vitchyr H. Pong
Ashvin Nair
Laura M. Smith
Catherine Huang
Sergey Levine
OffRL
320
75
0
08 Jul 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Neural Information Processing Systems (NeurIPS), 2021
Michael Janner
Qiyang Li
Sergey Levine
OffRL
541
776
0
03 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Neural Information Processing Systems (NeurIPS), 2021
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
495
1,950
0
02 Jun 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Neural Information Processing Systems (NeurIPS), 2021
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
519
469
0
16 Feb 2021
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
Neural Information Processing Systems (NeurIPS), 2021
Kefan Dong
Jiaqi Yang
Tengyu Ma
419
37
0
08 Feb 2021
Impact of Representation Learning in Linear Bandits
Jiaqi Yang
Wei Hu
Jason D. Lee
S. Du
244
55
0
13 Oct 2020
Offline Meta-Reinforcement Learning with Advantage Weighting
E. Mitchell
Rafael Rafailov
Xue Bin Peng
Sergey Levine
Chelsea Finn
OffRL
324
113
0
13 Aug 2020
Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
International Conference on Machine Learning (ICML), 2020
Emmy Liu
Aditi Raghunathan
Abigail Z. Jacobs
Chelsea Finn
OffRL
480
74
0
06 Aug 2020