1802.09127
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
International Conference on Learning Representations (ICLR), 2018
26 February 2018
C. Riquelme
George Tucker
Jasper Snoek
BDL
Papers citing
"Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling"
50 / 231 papers shown
Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs
Wang Wei
Tiankai Yang
Hongjie Chen
Yue Zhao
Franck Dernoncourt
Ryan Rossi
Hoda Eldardiry
OffRL
91
0
0
08 Oct 2025
Evolutionary Generative Optimization: Towards Fully Data-Driven Evolutionary Optimization via Generative Learning
Kebin Sun
Tao Jiang
Ran Cheng
Yaochu Jin
Kay Chen Tan
144
0
0
01 Aug 2025
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
Emile Anand
Sarah Liaw
254
0
0
21 Jul 2025
Measurement-Aligned Sampling for Inverse Problem
Shaorong Zhang
Rob Brekelmans
Yunshu Wu
Greg Ver Steeg
DiffM
271
0
0
13 Jun 2025
Neural Logistic Bandits
Seoungbin Bae
Dabeen Lee
988
1
0
04 May 2025
Exploring Pseudo-Token Approaches in Transformer Neural Processes
Jose Lara-Rangel
Nanze Chen
Fengzhe Zhang
179
1
0
19 Apr 2025
Active Human Feedback Collection via Neural Contextual Dueling Bandits
Arun Verma
Xiaoqiang Lin
Zhongxiang Dai
Daniela Rus
Bryan Kian Hsiang Low
315
2
0
16 Apr 2025
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning
Yexin Li
Pring Wong
Hanfang Zhang
Shuo Chen
Siyuan Qi
OffRL
320
2
0
23 Mar 2025
NeuroSep-CP-LCB: A Deep Learning-based Contextual Multi-armed Bandit Algorithm with Uncertainty Quantification for Early Sepsis Prediction
Anni Zhou
Raheem Beyah
Rishikesan Kamaleswaran
271
1
0
20 Mar 2025
Exploring the Potential of Bilevel Optimization for Calibrating Neural Networks
Irish Conference on Artificial Intelligence and Cognitive Science (AICS), 2025
Gabriele Sanguin
Arjun Pakrashi
Marco Viola
Francesco Rinaldi
365
0
0
17 Mar 2025
Exploiting Concavity Information in Gaussian Process Contextual Bandit Optimization
Kevin Li
Eric Laber
267
0
0
13 Mar 2025
Active Learning for Direct Preference Optimization
Branislav Kveton
Xintong Li
Julian McAuley
Ryan Rossi
Jingbo Shang
Junda Wu
Tong Yu
354
3
0
03 Mar 2025
LNUCB-TA: Linear-nonlinear Hybrid Bandit Learning with Temporal Attention
H. Khosravi
Mohammad Reza Shafie
Ahmed Shoyeb Raihan
Srinjoy Das
I. Imtiaz Ahmed
363
1
0
01 Mar 2025
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
H. Bui
Enrique Mallada
Anqi Liu
1.0K
3
0
08 Nov 2024
PageRank Bandits for Link Prediction
Neural Information Processing Systems (NeurIPS), 2024
Yikun Ban
Jiaru Zou
Zihao Li
Yunzhe Qi
Dongqi Fu
Jian Kang
Hanghang Tong
Jingrui He
329
14
0
03 Nov 2024
Online Posterior Sampling with a Diffusion Prior
Neural Information Processing Systems (NeurIPS), 2024
Branislav Kveton
Boris Oreshkin
Youngsuk Park
Aniket Deshmukh
Rui Song
DiffM
215
4
0
04 Oct 2024
The Digital Transformation in Health: How AI Can Improve the Performance of Health Systems
Health systems and reform (HSR), 2024
África Periánez
Ana Fernández del Río
Ivan Nazarov
Enric Jané
Moiz Hassan
Aditya Rastogi
Dexian Tang
243
19
0
24 Sep 2024
Adaptive User Journeys in Pharma E-Commerce with Reinforcement Learning: Insights from SwipeRx
Ana Fernández del Río
Michael Brennan Leong
Paulo Saraiva
Ivan Nazarov
Aditya Rastogi
Moiz Hassan
Dexian Tang
África Periánez
OffRL
OnRL
216
2
0
15 Aug 2024
Adaptive Behavioral AI: Reinforcement Learning to Enhance Pharmacy Services
Ana Fernández del Río
Michael Brennan Leong
Paulo Saraiva
Ivan Nazarov
Aditya Rastogi
Moiz Hassan
Dexian Tang
África Periánez
OffRL
166
6
0
14 Aug 2024
Optimizing HIV Patient Engagement with Reinforcement Learning in Resource-Limited Settings
África Periánez
Kathrin Schmitz
Lazola Makhupula
Moiz Hassan
Moeti Moleko
Ana Fernández del Río
Ivan Nazarov
Aditya Rastogi
Dexian Tang
OffRL
177
0
0
14 Aug 2024
Meta Clustering of Neural Bandits
Knowledge Discovery and Data Mining (KDD), 2024
Yikun Ban
Yunzhe Qi
Tianxin Wei
Lihui Liu
Jingrui He
297
10
0
10 Aug 2024
AExGym: Benchmarks and Environments for Adaptive Experimentation
Jimmy Wang
Ethan Che
Daniel R. Jiang
Hongseok Namkoong
280
0
0
08 Aug 2024
Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits
Ziyi Huang
Henry Lam
Haofeng Zhang
387
1
0
20 Jun 2024
Graph Neural Thompson Sampling
Shuang Wu
Arash A. Amini
363
1
0
15 Jun 2024
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert D. Nowak
635
6
0
07 Jun 2024
A Bayesian Approach to Online Planning
Nir Greshler
David Ben-Eli
Carmel Rabinovitz
Gabi Guetta
Liran Gispan
Guy Zohar
Aviv Tamar
173
1
0
04 Jun 2024
Position: Why We Must Rethink Empirical Research in Machine Learning
International Conference on Machine Learning (ICML), 2024
Moritz Herrmann
F. J. D. Lange
Katharina Eggensperger
Giuseppe Casalicchio
Marcel Wever
Matthias Feurer
David Rügamer
Eyke Hüllermeier
A. Boulesteix
B. Bischl
253
23
0
03 May 2024
Online Personalizing White-box LLMs Generation with Neural Bandits
Zekai Chen
Weeden Daniel
Po-yu Chen
Francois Buet-Golfouse
159
4
0
24 Apr 2024
Uncertainty in Language Models: Assessment through Rank-Calibration
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xinmeng Huang
Shuo Li
Mengxin Yu
Matteo Sesia
Hamed Hassani
Insup Lee
Osbert Bastani
Guang Cheng
240
32
0
04 Apr 2024
On the Importance of Uncertainty in Decision-Making with Large Language Models
Nicolò Felicioni
Lucas Maystre
Sina Ghiassian
K. Ciosek
LLMAG
342
6
0
03 Apr 2024
Better than classical? The subtle art of benchmarking quantum machine learning models
Joseph Bowles
Shahnawaz Ahmed
Maria Schuld
359
124
0
11 Mar 2024
ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment
International Conference on Cyber-Physical Systems (ICCPS), 2024
Hao-Lun Hsu
Qitong Gao
Miroslav Pajic
273
0
0
11 Mar 2024
Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
Xiaoying Zhang
Jean-François Ton
Wei Shen
Hongning Wang
Yang Liu
168
19
0
08 Mar 2024
Watch Your Head: Assembling Projection Heads to Save the Reliability of Federated Models
Jinqian Chen
Jihua Zhu
Qinghai Zheng
Zhongyu Li
Zhiqiang Tian
FedML
229
3
0
26 Feb 2024
Diffusion Models Meet Contextual Bandits
Imad Aouali
DiffM
294
5
0
15 Feb 2024
Predictive Churn with the Set of Good Models
J. Watson-Daniels
Flavio du Pin Calmon
Alexander D'Amour
Carol Xuan Long
David C. Parkes
Berk Ustun
317
11
0
12 Feb 2024
LiRank: Industrial Large Scale Ranking Models at LinkedIn
Knowledge Discovery and Data Mining (KDD), 2024
Fedor Borisyuk
Mingzhou Zhou
Qingquan Song
Sirou Zhu
B. Tiwana
...
Chen-Chen Jiang
Haichao Wei
Maneesh Varshney
Amol Ghoting
Souvik Ghosh
189
10
0
10 Feb 2024
Efficient Exploration for LLMs
Vikranth Dwaracherla
S. Asghari
Botao Hao
Benjamin Van Roy
LLMAG
419
35
0
01 Feb 2024
Improving sample efficiency of high dimensional Bayesian optimization with MCMC
Zeji Yi
Yunyue Wei
Chu Xin Cheng
Kaibo He
Yanan Sui
207
8
0
05 Jan 2024
A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration
Fahri Wisnu Murti
Samad Ali
Matti Latva-aho
225
1
0
26 Dec 2023
Risk-Aware Continuous Control with Neural Contextual Bandits
AAAI Conference on Artificial Intelligence (AAAI), 2023
J. Ayala-Romero
A. Garcia-Saavedra
Xavier Pérez Costa
179
4
0
15 Dec 2023
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions
Easton K. Huch
Jieru Shi
Madeline R Abbott
J. Golbus
Alexander Moreno
Walter Dempsey
OffRL
375
0
0
11 Dec 2023
Bootstrap Your Own Variance
Polina Turishcheva
Jason Ramapuram
Sinead Williamson
Dan Busbridge
Eeshan Gunesh Dhekane
Russ Webb
UQCV
212
1
0
06 Dec 2023
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
International Conference on Machine Learning (ICML), 2023
Bairu Hou
Yujian Liu
Kaizhi Qian
Jacob Andreas
Shiyu Chang
Yang Zhang
UD
UQCV
PER
335
94
0
15 Nov 2023
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Neural Information Processing Systems (NeurIPS), 2023
Nikki Lijing Kuang
Ming Yin
Mengdi Wang
Yu Wang
Yian Ma
304
6
0
29 Oct 2023
Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yunfan Zhao
Nikhil Behari
Edward Hughes
Edwin Zhang
Dheeraj M. Nagaraj
K. Tuyls
Aparna Taneja
Milind Tambe
279
10
0
23 Oct 2023
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling
Zheqing Zhu
Yueyang Liu
Xu Kuang
Benjamin Van Roy
AI4TS
239
1
0
11 Oct 2023
Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data
S.M.F. Sani
Seyed Abbas Hosseini
Hamid R. Rabiee
OffRL
242
1
0
07 Oct 2023
Multi-fidelity climate model parameterization for better generalization and extrapolation
Mohamed Aziz Bhouri
Liran Peng
Michael S. Pritchard
Pierre Gentine
AI4CE
250
7
0
19 Sep 2023
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiuhai Chen
Jonas W. Mueller
360
113
0
30 Aug 2023