Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

2 January 2019
Chicheng Zhang
Alekh Agarwal
Hal Daumé
John Langford
S. Negahban
arXiv: 1901.00301

Papers citing "Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback"

16 / 16 papers shown
Offline Clustering of Preference Learning with Active-data Augmentation
Jingyuan Liu
Fatemeh Ghaffari
Xuchuang Wang
Xutong Liu
Mohammad Hajiesmaili
Carlee Joe-Wong
OffRL
279
0
0
30 Oct 2025
Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis
Ruiquan Huang
Donghao Li
Chengshuai Shi
Cong Shen
Jing Yang
OffRL
505
0
0
01 Jul 2025
Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards
Wenlong Ji
Yihan Pan
Ruihao Zhu
Lihua Lei
238
6
0
20 Jun 2025
Best Arm Identification with Possibly Biased Offline Data
Conference on Uncertainty in Artificial Intelligence (UAI), 2025
Le Yang
Vincent Y. F. Tan
Wang Chi Cheung
202
2
0
29 May 2025
Deconfounded Warm-Start Thompson Sampling with Applications to Precision Medicine
Prateek Jaiswal
Esmaeil Keyvanshokooh
Junyu Cao
312
0
0
22 May 2025
Warm Starting of CMA-ES for Contextual Optimization Problems
Parallel Problem Solving from Nature (PPSN), 2025
Yuta Sekino
Kento Uchida
Shinichi Shirakawa
350
1
0
18 Feb 2025
MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings
Ashkan Golgoon
Ryan Franks
Khashayar Filom
Arjun Ravi Kannan
386
0
0
01 Nov 2024
Jump Starting Bandits with LLM-Generated Prior Knowledge
P. A. Alamdari
Yanshuai Cao
Kevin H. Wilson
251
11
0
27 Jun 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
803
4
0
13 Jun 2024
Leveraging User-Triggered Supervision in Contextual Bandits
Alekh Agarwal
Claudio Gentile
T. V. Marinov
207
0
0
07 Feb 2023
Leveraging Demonstrations to Improve Online Learning: Quality Matters
International Conference on Machine Learning (ICML), 2023
Botao Hao
Rahul Jain
Tor Lattimore
Benjamin Van Roy
Zheng Wen
604
13
0
07 Feb 2023
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee
Sean R. Sinclair
Milind Tambe
Lily Xu
Chao Yu
AI4TS
834
10
0
30 Sep 2022
Thompson Sampling for Robust Transfer in Multi-Task Bandits
International Conference on Machine Learning (ICML), 2022
Zhi Wang
Chicheng Zhang
Kamalika Chaudhuri
AAML
326
7
0
17 Jun 2022
Multitask Bandit Learning Through Heterogeneous Feedback Aggregation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Zhi Wang
Chicheng Zhang
Manish Singh
L. Riek
Kamalika Chaudhuri
480
26
0
29 Oct 2020
DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees
IEEE International Conference on Data Engineering (ICDE), 2020
R. Perera
Bastian Oetomo
Benjamin I. P. Rubinstein
Renata Borovica-Gajic
199
41
0
19 Oct 2020
Combining Offline Causal Inference and Online Bandit Learning for Data Driven Decision
Li Ye
Yishi Lin
Hong Xie
John C. S. Lui
CML
260
12
0
16 Jan 2020