Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

2 January 2019

Papers citing "Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback"

Offline Clustering of Preference Learning with Active-data Augmentation

279

30 Oct 2025

Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis

505

01 Jul 2025

Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards

238

20 Jun 2025

Best Arm Identification with Possibly Biased Offline Data

Conference on Uncertainty in Artificial Intelligence (UAI), 2025

Le Yang

Vincent Y. F. Tan

Wang Chi Cheung

202

29 May 2025

Deconfounded Warm-Start Thompson Sampling with Applications to Precision Medicine

Prateek Jaiswal

Esmaeil Keyvanshokooh

Junyu Cao

312

22 May 2025

Warm Starting of CMA-ES for Contextual Optimization Problems

Parallel Problem Solving from Nature (PPSN), 2025

Yuta Sekino

Kento Uchida

Shinichi Shirakawa

350

18 Feb 2025

MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings

386

01 Nov 2024

Jump Starting Bandits with LLM-Generated Prior Knowledge

P. A. Alamdari

Yanshuai Cao

Kevin H. Wilson

251

27 Jun 2024

Online Bandit Learning with Offline Preference Data for Improved RLHF

803

13 Jun 2024

Leveraging User-Triggered Supervision in Contextual Bandits

Alekh Agarwal

Claudio Gentile

T. V. Marinov

207

07 Feb 2023

Leveraging Demonstrations to Improve Online Learning: Quality Matters

International Conference on Machine Learning (ICML), 2023

604

07 Feb 2023

Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits

834

30 Sep 2022

Thompson Sampling for Robust Transfer in Multi-Task Bandits

International Conference on Machine Learning (ICML), 2022

326

17 Jun 2022

Multitask Bandit Learning Through Heterogeneous Feedback Aggregation

International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

480

29 Oct 2020

DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees

IEEE International Conference on Data Engineering (ICDE), 2020

R. Perera

Bastian Oetomo

Benjamin I. P. Rubinstein

Renata Borovica-Gajic

199

19 Oct 2020

Combining Offline Causal Inference and Online Bandit Learning for Data Driven Decision

260

16 Jan 2020