2406.05064
Cited By
v1
v2
v3 (latest)
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
7 June 2024
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert D. Nowak
Papers citing
"Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning"
41 / 91 papers shown
Title
Gamification of Pure Exploration for Linear Bandits
Rémy Degenne
Pierre Ménard
Xuedong Shang
Michal Valko
258
84
0
02 Jul 2020
Latent Bandits Revisited
Joey Hong
Branislav Kveton
Manzil Zaheer
Yinlam Chow
Amr Ahmed
Craig Boutilier
OffRL
221
52
0
15 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
384
2,181
0
08 Jun 2020
Low-Rank Generalized Linear Bandit Problems
Yangyi Lu
A. Meisami
Ambuj Tewari
304
51
0
04 Jun 2020
Exploration-Exploitation in Constrained MDPs
Yonathan Efroni
Shie Mannor
Matteo Pirotta
223
196
0
04 Mar 2020
Rapidly Personalizing Mobile Health Treatment Policies with Limited Data
Sabina Tomkins
Peng Liao
P. Klasnja
Serena Yeung
Susan Murphy
180
6
0
23 Feb 2020
Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2020
Noah Y. Siegel
Jost Tobias Springenberg
Felix Berkenkamp
A. Abdolmaleki
Michael Neunert
Thomas Lampe
Agrim Gupta
Nicolas Heess
Martin Riedmiller
OffRL
215
291
0
19 Feb 2020
Thompson Sampling Algorithms for Mean-Variance Bandits
International Conference on Machine Learning (ICML), 2020
Qiuyu Zhu
Vincent Y. F. Tan
242
51
0
01 Feb 2020
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
417
758
0
26 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Journal of Machine Learning Research (JMLR), 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
1.5K
23,535
0
23 Oct 2019
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
International Conference on Learning Representations (ICLR), 2019
L. Zintgraf
K. Shiarlis
Maximilian Igl
Sebastian Schulze
Y. Gal
Katja Hofmann
Shimon Whiteson
OffRL
280
303
0
18 Oct 2019
A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning
Nicholas C. Landolfi
G. Thomas
Tengyu Ma
OffRL
224
19
0
11 Jul 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Neural Information Processing Systems (NeurIPS), 2019
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
332
1,174
0
03 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
International Conference on Machine Learning (ICML), 2019
Lin F. Yang
Mengdi Wang
OffRL
GP
309
301
0
24 May 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
International Conference on Machine Learning (ICML), 2019
Kate Rakelly
Aurick Zhou
Deirdre Quillen
Chelsea Finn
Sergey Levine
OffRL
214
733
0
19 Mar 2019
An Information-Theoretic Approach to Minimax Regret in Partial Monitoring
Annual Conference Computational Learning Theory (COLT), 2019
Tor Lattimore
Csaba Szepesvári
275
72
0
01 Feb 2019
Bilinear Bandits with Low-rank Structure
Kwang-Sung Jun
Rebecca Willett
S. Wright
Robert D. Nowak
374
68
0
08 Jan 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
672
1,828
0
07 Dec 2018
An Introduction to Deep Reinforcement Learning
Vincent François-Lavet
Peter Henderson
Riashat Islam
Marc G. Bellemare
Joelle Pineau
OffRL
AI4CE
344
1,388
0
30 Nov 2018
Predicting the Computational Cost of Deep Learning Models
Daniel Justus
John Brennan
Stephen Bonner
A. Mcgough
89
261
0
28 Nov 2018
ProMP: Proximal Meta-Policy Search
Jonas Rothfuss
Dennis Lee
I. Clavera
Tamim Asfour
Pieter Abbeel
273
217
0
16 Oct 2018
Relational Deep Reinforcement Learning
V. Zambaldi
David Raposo
Adam Santoro
V. Bapst
Yujia Li
...
Victoria Langston
Razvan Pascanu
M. Botvinick
Oriol Vinyals
Peter W. Battaglia
OffRL
318
232
0
05 Jun 2018
Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning
Anusha Nagabandi
I. Clavera
Simin Liu
R. Fearing
Pieter Abbeel
Sergey Levine
Chelsea Finn
452
608
0
30 Mar 2018
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
International Conference on Learning Representations (ICLR), 2018
C. Riquelme
George Tucker
Jasper Snoek
BDL
237
376
0
26 Feb 2018
Meta-Reinforcement Learning of Structured Exploration Strategies
Abhishek Gupta
Russell Mendonca
YuXuan Liu
Pieter Abbeel
Sergey Levine
OffRL
230
366
0
20 Feb 2018
Stochastic Low-Rank Bandits
Branislav Kveton
Csaba Szepesvári
Anup B. Rao
Zheng Wen
Yasin Abbasi-Yadkori
S. Muthukrishnan
167
42
0
13 Dec 2017
The Implicit Bias of Gradient Descent on Separable Data
Journal of Machine Learning Research (JMLR), 2017
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
834
999
0
27 Oct 2017
Attention Is All You Need
Neural Information Processing Systems (NeurIPS), 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
2.7K
159,090
0
12 Jun 2017
Geometry of Optimization and Implicit Regularization in Deep Learning
Behnam Neyshabur
Ryota Tomioka
Ruslan Salakhutdinov
Nathan Srebro
AI4CE
154
138
0
08 May 2017
On Kernelized Multi-armed Bandits
Sayak Ray Chowdhury
Aditya Gopalan
318
510
0
03 Apr 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
1.6K
13,372
0
09 Mar 2017
Learning to reinforcement learn
Jane X. Wang
Z. Kurth-Nelson
Dhruva Tirumala
Hubert Soyer
Joel Z Leibo
Rémi Munos
Charles Blundell
D. Kumaran
M. Botvinick
OffRL
378
1,039
0
17 Nov 2016
RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning
Yan Duan
John Schulman
Xi Chen
Peter L. Bartlett
Ilya Sutskever
Pieter Abbeel
OffRL
261
1,091
0
09 Nov 2016
Yelp Dataset Challenge: Review Rating Prediction
Nabiha Asghar
140
200
0
17 May 2016
One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors
Justin Fu
Sergey Levine
Pieter Abbeel
OffRL
160
165
0
23 Sep 2015
Lipschitz Bandits: Regret Lower Bounds and Optimal Algorithms
Annual Conference Computational Learning Theory (COLT), 2014
Stefan Magureanu
Richard Combes
Alexandre Proutiere
286
167
0
19 May 2014
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
279
13,183
0
19 Dec 2013
Finite-Time Analysis of Kernelised Contextual Bandits
Conference on Uncertainty in Artificial Intelligence (UAI), 2013
Michal Valko
N. Korda
Rémi Munos
I. Flaounas
N. Cristianini
340
298
0
26 Sep 2013
Linear Bandits in High Dimension and Recommendation Systems
Allerton Conference on Communication, Control, and Computing (Allerton), 2012
Y. Deshpande
Andrea Montanari
OffRL
217
73
0
08 Jan 2013
Lipschitz Bandits without the Lipschitz Constant
International Conference on Algorithmic Learning Theory (ALT), 2011
Sébastien Bubeck
Jean-Michel Poggi
Jia Yuan Yu
342
92
0
25 May 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation
The Web Conference (WWW), 2010
Lihong Li
Wei Chu
John Langford
Robert Schapire
774
3,132
0
28 Feb 2010