A Survey on Contextual Multi-armed Bandits

13 August 2015

Li Zhou

Papers citing "A Survey on Contextual Multi-armed Bandits"

48 / 48 papers shown

Reward Model Routing in Alignment

Xinle Wu

Yao Lu

154

03 Oct 2025

Latent Preference Bandits

Newton Mwai

Emil Carlsson

Fredrik D. Johansson

237

07 Aug 2025

Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype

Nikola Tankovic

Robert Sajina

297

22 May 2025

In-Domain African Languages Translation Using LLMs and Multi-armed Bandits

240

21 May 2025

Information maximization for a broad variety of multi-armed bandit games

Alex Barbier-Chebbah

Christian L. Vestergaard

Jean-Baptiste Masson

204

20 Mar 2025

LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits

590

02 Oct 2024

BFTBrain: Adaptive BFT Consensus with Reinforcement Learning

Symposium on Networked Systems Design and Implementation (NSDI), 2024

274

12 Aug 2024

Reciprocal Learning

Neural Information Processing Systems (NeurIPS), 2024

344

12 Aug 2024

Active Inference in Contextual Multi-Armed Bandits for Autonomous Robotic Exploration

IEEE Transactions on Robotics (IEEE Trans. Robot.), 2024

261

07 Aug 2024

EdgeLoc: A Communication-Adaptive Parallel System for Real-Time Localization in Infrastructure-Assisted Autonomous Driving

Boyi Liu

Jingwen Tong

Yufan Zhuang

321

20 May 2024

Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making

IEEE Conference on Decision and Control (CDC), 2024

Xin Chen

I-Hong Hou

174

22 Mar 2024

Multi-Excitation Projective Simulation with a Many-Body Physics Inspired Inductive Bias

335

15 Feb 2024

Pricing with Contextual Elasticity and Heteroscedastic Valuation

Jianyu Xu

Yu-Xiang Wang

233

26 Dec 2023

Design Principles of Robust Multi-Armed Bandit Framework in Video Recommendations

147

24 Sep 2023

Approximate information for efficient exploration-exploitation strategies

Physical Review E (PRE), 2023

A. Barbier-Chebbah

Christian L. Vestergaard

Jean-Baptiste Masson

191

04 Jul 2023

Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

International Conference on Learning Representations (ICLR), 2023

Yuwei Luo

Mohsen Bayati

420

26 Jun 2023

Learning Action Embeddings for Off-Policy Evaluation

European Conference on Information Retrieval (ECIR), 2023

218

06 May 2023

Lero: A Learning-to-Rank Query Optimizer

Proceedings of the VLDB Endowment (PVLDB), 2023

Jingren Zhou

334

14 Feb 2023

Personalized Reward Learning with Interaction-Grounded Learning (IGL)

International Conference on Learning Representations (ICLR), 2022

256

28 Nov 2022

AdaChain: A Learned Adaptive Blockchain

Proceedings of the VLDB Endowment (PVLDB), 2022

315

03 Nov 2022

Sequential Decision Making on Unmatched Data using Bayesian Kernel Embeddings

Diego Martinez-Taboada

Dino Sejdinovic

BDL

181

25 Oct 2022

Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook

Expert Systems with Applications (ESWA), 2022

Baihan Lin

OffRL AI4TS

479

24 Oct 2022

A Scalable Recommendation Engine for New Users and Items

Social Science Research Network (SSRN), 2022

Boya Xu

Yiting Deng

C. Mela

313

06 Sep 2022

Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms

International Conference on Machine Learning (ICML), 2022

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

Shie Mannor

AAML

278

30 Jan 2022

Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning

Journal of the American Statistical Association (JASA), 2021

402

29 Oct 2021

Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models

Neural Information Processing Systems (NeurIPS), 2021

Runzhe Wan

Linjuan Ge

Rui Song

324

13 Aug 2021

Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits

216

25 Jun 2021

Dealing with Expert Bias in Collective Decision-Making

425

25 Jun 2021

Relational Boosted Bandits

AAAI Conference on Artificial Intelligence (AAAI), 2020

A. Kakadiya

S. Natarajan

Balaraman Ravindran

178

16 Dec 2020

Adversarial Linear Contextual Bandits with Graph-Structured Side Observations

AAAI Conference on Artificial Intelligence (AAAI), 2020

328

10 Dec 2020

Asymptotic Randomised Control with applications to bandits

Samuel N. Cohen

Tanut Treetanthiploet

409

14 Oct 2020

Decentralized Learning for Channel Allocation in IoT Networks over Unlicensed Bandwidth as a Contextual Multi-player Multi-armed Bandit Game

IEEE Transactions on Wireless Communications (TWC), 2020

Wenbo Wang

Amir Leshem

Dusit Niyato

Zhu Han

283

30 Mar 2020

Contextual Constrained Learning for Dose-Finding Clinical Trials

International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

254

08 Jan 2020

Adaptive Modulation and Coding based on Reinforcement Learning for 5G Networks

25 Nov 2019

Multi-Armed Bandits with Correlated Arms

IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2019

498

06 Nov 2019

Optimising Individual-Treatment-Effect Using Bandits

100

16 Oct 2019

Gittins' theorem under uncertainty

Electronic Journal of Probability (EJP), 2019

Samuel N. Cohen

Tanut Treetanthiploet

258

12 Jul 2019

Rarely-switching linear bandits: optimization of causal effects for the real world

B. Lansdell

Sofia Triantafillou

Konrad Paul Kording

288

30 May 2019

Risk-Averse Explore-Then-Commit Algorithms for Finite-Time Bandits

IEEE Conference on Decision and Control (CDC), 2019

268

30 Apr 2019

Correlated Multi-armed Bandits with a Latent Random Source

Samarth Gupta

Gauri Joshi

Osman Yağan

217

17 Aug 2018

Linear Bandits with Stochastic Delayed Feedback

International Conference on Machine Learning (ICML), 2018

409

05 Jul 2018

Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits

Prashant Singh

Andreas Hellander

260

22 May 2018

Online Learning: A Comprehensive Survey

469

785

08 Feb 2018

Courtesy as a Means to Coordinate

Panayiotis Danassis

Boi Faltings

254

22 Jan 2018

Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits

Huasen Wu

Xueying Guo

Xin Liu

162

12 Sep 2017

Latent Contextual Bandits and their Application to Personalized Recommendations for New Users

Li Zhou

Emma Brunskill

185

22 Apr 2016

A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit

Giuseppe Burtini

Jason L. Loeppky

Ramon Lawrence

419

123

02 Oct 2015

Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits

403

27 Apr 2015