Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

7 June 2024

Papers citing "Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning"

50 / 91 papers shown

In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks

Huitao Yang

Guanting Chen

OffRL

144

30 Sep 2025

In-Context Compositional Q-Learning for Offline Reinforcement Learning

186

28 Sep 2025

HVAC-DPT: A Decision Pretrained Transformer for HVAC Control

Anaïs Berkes

AI4CE

392

29 Nov 2024

Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems

Neural Information Processing Systems (NeurIPS), 2024

Yue Kang

Cho-Jui Hsieh

T. C. Lee

399

14 Jan 2024

Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges

376

29 Dec 2023

In-Context Reinforcement Learning for Variable Action Spaces

898

20 Dec 2023

Multi-task Representation Learning for Pure Exploration in Bilinear Bandits

Neural Information Processing Systems (NeurIPS), 2023

501

01 Nov 2023

Rethinking Decision Transformer via Hierarchical Reinforcement Learning

International Conference on Machine Learning (ICML), 2023

Jianye Hao

330

01 Nov 2023

Transformers are Provably Optimal In-context Estimators for Wireless Communications

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Vishnu Teja Kunde

Vicram Rajagopalan

Chandra Shekhara Kaushik Valmeekam

725

01 Nov 2023

Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining

International Conference on Learning Representations (ICLR), 2023

389

12 Oct 2023

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency

575

29 Sep 2023

Bayesian Low-rank Adaptation for Large Language Models

International Conference on Learning Representations (ICLR), 2023

933

105

24 Aug 2023

ExpeL: LLM Agents Are Experiential Learners

AAAI Conference on Artificial Intelligence (AAAI), 2023

Gao Huang

550

473

20 Aug 2023

Large Language Models as General Pattern Machines

Conference on Robot Learning (CoRL), 2023

Montse Gonzalez Arenas

Kanishka Rao

Dorsa Sadigh

Andy Zeng

LLMAG

384

274

10 Jul 2023

Supervised Pretraining Can Learn In-Context Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

347

139

26 Jun 2023

Large Language Models are Few-Shot Health Learners

484

161

24 May 2023

WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

461

105

23 May 2023

Structured State Space Models for In-Context Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

Feryal M. P. Behbahani

AI4TS

656

139

07 Mar 2023

Multi-task Representation Learning for Pure Exploration in Linear Bandits

International Conference on Machine Learning (ICML), 2023

Yihan Du

Longbo Huang

Wen Sun

415

09 Feb 2023

Transformers as Algorithms: Generalization and Stability in In-context Learning

International Conference on Machine Learning (ICML), 2023

Yingcong Li

M. E. Ildiz

Dimitris Papailiopoulos

Samet Oymak

432

239

17 Jan 2023

Optimal Algorithms for Latent Bandits with Cluster Structure

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

S. Pal

A. Suggala

Karthikeyan Shanmugam

Prateek Jain

470

17 Jan 2023

RT-1: Robotics Transformer for Real-World Control at Scale

...

685

2,067

13 Dec 2022

Multi-Task Off-Policy Learning from Bandit Feedback

International Conference on Machine Learning (ICML), 2022

490

09 Dec 2022

Learning Options via Compression

Neural Information Processing Systems (NeurIPS), 2022

395

08 Dec 2022

In-context Reinforcement Learning with Algorithm Distillation

International Conference on Learning Representations (ICLR), 2022

Stephen Spencer

...

351

188

25 Oct 2022

Dichotomy of Control: Separating What You Can Control from What You Cannot

International Conference on Learning Representations (ICLR), 2022

Pieter Abbeel

266

24 Oct 2022

Tractable Optimality in Episodic Latent MABs

Neural Information Processing Systems (NeurIPS), 2022

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

Shie Mannor

323

05 Oct 2022

Partially Observable Markov Decision Processes in Robotics: A Survey

IEEE Transactions on Robotics (TRO), 2022

M. Lauri

David Hsu

Joni Pajarinen

466

198

21 Sep 2022

On The Computational Complexity of Self-Attention

International Conference on Algorithmic Learning Theory (ALT), 2022

Feyza Duman Keles

Pruthuvi Maheshakya Wijewardena

Chinmay Hegde

395

262

11 Sep 2022

Behavior Transformers: Cloning k modes with one stone

Neural Information Processing Systems (NeurIPS), 2022

Nur Muhammad (Mahi) Shafiullah

466

349

22 Jun 2022

When does return-conditioned supervised learning work for offline reinforcement learning?

Neural Information Processing Systems (NeurIPS), 2022

David Brandfonbrener

Joan Bruna

356

02 Jun 2022

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters

Neural Information Processing Systems (NeurIPS), 2022

Seyed Kamyar Seyed Ghasemipour

S. Gu

Ofir Nachum

OffRL

288

100

27 May 2022

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

Guang Chen

Jun Wang

683

318

20 May 2022

Sergio Gomez Colmenarejo

...

686

1,038

12 May 2022

Few-shot learning for medical text: A systematic review

233

21 Apr 2022

Nearly Minimax Algorithms for Linear Bandits with Shared Representation

349

29 Mar 2022

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Luke Zettlemoyer

680

1,949

25 Feb 2022

Deep Hierarchy in Bandits

International Conference on Machine Learning (ICML), 2022

292

03 Feb 2022

Transformers Can Do Bayesian Inference

International Conference on Learning Representations (ICLR), 2021

Samuel G. Müller

Noah Hollmann

Sebastian Pineda Arango

Josif Grabocka

Katharina Eggensperger

BDL UQCV

1.2K

280

20 Dec 2021

Hierarchical Bayesian Bandits

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

369

12 Nov 2021

An Explanation of In-context Learning as Implicit Bayesian Inference

International Conference on Learning Representations (ICLR), 2021

1.2K

1,017

03 Nov 2021

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

Andrea Madotto

Mohammad Kachuee

Genta Indra Winata

Pascale Fung

240

15 Oct 2021

Offline Meta-Reinforcement Learning with Online Self-Supervision

International Conference on Machine Learning (ICML), 2021

440

08 Jul 2021

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Neural Information Processing Systems (NeurIPS), 2021

889

832

03 Jun 2021

Decision Transformer: Reinforcement Learning via Sequence Modeling

Neural Information Processing Systems (NeurIPS), 2021

Aravind Rajeswaran

Pieter Abbeel

779

2,192

02 Jun 2021

COMBO: Conservative Offline Model-Based Policy Optimization

Neural Information Processing Systems (NeurIPS), 2021

Aravind Rajeswaran

772

510

16 Feb 2021

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

Neural Information Processing Systems (NeurIPS), 2021

Kefan Dong

Jiaqi Yang

Tengyu Ma

690

08 Feb 2021

Impact of Representation Learning in Linear Bandits

341

13 Oct 2020

Offline Meta-Reinforcement Learning with Advantage Weighting

476

122

13 Aug 2020

Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices

International Conference on Machine Learning (ICML), 2020

688

06 Aug 2020