A Survey on Contextual Multi-armed Bandits

13 August 2015
Li Zhou

Papers citing "A Survey on Contextual Multi-armed Bandits"

48 / 48 papers shown
Reward Model Routing in Alignment
Xinle Wu
Yao Lu
154
2
0
03 Oct 2025
Latent Preference Bandits
Newton Mwai
Emil Carlsson
Fredrik D. Johansson
237
0
0
07 Aug 2025
Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype
Nikola Tankovic
Robert Sajina
297
0
0
22 May 2025
In-Domain African Languages Translation Using LLMs and Multi-armed Bandits
Pratik Rakesh Singh
Kritarth Prasad
Mohammadi Zaki
Pankaj Wasnik
240
1
0
21 May 2025
Information maximization for a broad variety of multi-armed bandit games
Alex Barbier-Chebbah
Christian L. Vestergaard
Jean-Baptiste Masson
204
0
0
20 Mar 2025
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits
Duy Nguyen
Archiki Prasad
Elias Stengel-Eskin
Joey Tianyi Zhou
590
6
0
02 Oct 2024
BFTBrain: Adaptive BFT Consensus with Reinforcement Learning
Symposium on Networked Systems Design and Implementation (NSDI), 2024
Chenyuan Wu
Haoyun Qin
Mohammad Javad Amiri
B. T. Loo
Dahlia Malkhi
Ryan Marcus
274
6
0
12 Aug 2024
Reciprocal Learning
Neural Information Processing Systems (NeurIPS), 2024
Julian Rodemann
Christoph Jansen
G. Schollmeyer
FedML
344
0
0
12 Aug 2024
Active Inference in Contextual Multi-Armed Bandits for Autonomous Robotic Exploration
IEEE Transactions on Robotics (IEEE Trans. Robot.), 2024
Shohei Wakayama
Alberto Candela
Paul Hayne
Nisar R. Ahmed
261
0
0
07 Aug 2024
EdgeLoc: A Communication-Adaptive Parallel System for Real-Time Localization in Infrastructure-Assisted Autonomous Driving
Boyi Liu
Jingwen Tong
Yufan Zhuang
321
6
0
20 May 2024
Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making
IEEE Conference on Decision and Control (CDC), 2024
Xin Chen
I-Hong Hou
174
8
0
22 Mar 2024
Multi-Excitation Projective Simulation with a Many-Body Physics Inspired Inductive Bias
Philip A. LeMaitre
Marius Krumm
Hans J. Briegel
AI4CE
335
2
0
15 Feb 2024
Pricing with Contextual Elasticity and Heteroscedastic Valuation
Jianyu Xu
Yu-Xiang Wang
233
3
0
26 Dec 2023
Design Principles of Robust Multi-Armed Bandit Framework in Video Recommendations
Belhassen Bayar
Phanideep Gampa
Ainur Yessenalina
Zhen Wen
AAML
147
0
0
24 Sep 2023
Approximate information for efficient exploration-exploitation strategies
Physical Review E (PRE), 2023
A. Barbier–Chebbah
Christian L. Vestergaard
Jean-Baptiste Masson
191
2
0
04 Jul 2023
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits
International Conference on Learning Representations (ICLR), 2023
Yuwei Luo
Mohsen Bayati
420
2
0
26 Jun 2023
Learning Action Embeddings for Off-Policy Evaluation
European Conference on Information Retrieval (ECIR), 2023
Matej Cief
Jacek Golebiowski
Philipp Schmidt
Ziawasch Abedjan
Artur Bekasov
OffRL
218
7
0
06 May 2023
Lero: A Learning-to-Rank Query Optimizer
Proceedings of the VLDB Endowment (PVLDB), 2023
Rong Zhu
Wei Chen
Bolin Ding
Xingguang Chen
A. Pfadler
Ziniu Wu
Jingren Zhou
334
96
0
14 Feb 2023
Personalized Reward Learning with Interaction-Grounded Learning (IGL)
International Conference on Learning Representations (ICLR), 2022
Jessica Maghakian
Paul Mineiro
Kishan Panaganti
Mark Rucker
Akanksha Saran
Cheng Tan
256
12
0
28 Nov 2022
AdaChain: A Learned Adaptive Blockchain
Proceedings of the VLDB Endowment (PVLDB), 2022
Chenyuan Wu
Bhavana Mehta
Mohammad Javad Amiri
Ryan Marcus
B. T. Loo
315
18
0
03 Nov 2022
Sequential Decision Making on Unmatched Data using Bayesian Kernel Embeddings
Diego Martinez-Taboada
Dino Sejdinovic
BDL
181
1
0
25 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Expert Systems with Applications (ESWA), 2022
Baihan Lin
OffRL
AI4TS
479
30
0
24 Oct 2022
A Scalable Recommendation Engine for New Users and Items
Social Science Research Network (SSRN), 2022
Boya Xu
Yiting Deng
C. Mela
313
2
0
06 Sep 2022
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms
International Conference on Machine Learning (ICML), 2022
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
AAML
278
6
0
30 Jan 2022
Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning
Journal of the American Statistical Association (JASA), 2021
Ye Shen
Hengrui Cai
Rui Song
OffRL
402
6
0
29 Oct 2021
Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models
Neural Information Processing Systems (NeurIPS), 2021
Runzhe Wan
Linjuan Ge
Rui Song
324
31
0
13 Aug 2021
Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits
Kaushik Roy
Tao Gui
Manas Gaur
A. Sheth
OffRL
216
15
0
25 Jun 2021
Dealing with Expert Bias in Collective Decision-Making
Axel Abels
Tom Lenaerts
V. Trianni
Ann Nowé
425
8
0
25 Jun 2021
Relational Boosted Bandits
AAAI Conference on Artificial Intelligence (AAAI), 2020
A. Kakadiya
S. Natarajan
Balaraman Ravindran
178
7
0
16 Dec 2020
Adversarial Linear Contextual Bandits with Graph-Structured Side Observations
AAAI Conference on Artificial Intelligence (AAAI), 2020
Lingda Wang
Bingcong Li
Huozhi Zhou
G. Giannakis
Lav Varshney
Zhizhen Zhao
328
9
0
10 Dec 2020
Asymptotic Randomised Control with applications to bandits
Samuel N. Cohen
Tanut Treetanthiploet
409
5
0
14 Oct 2020
Decentralized Learning for Channel Allocation in IoT Networks over Unlicensed Bandwidth as a Contextual Multi-player Multi-armed Bandit Game
IEEE Transactions on Wireless Communications (TWC), 2020
Wenbo Wang
Amir Leshem
Dusit Niyato
Zhu Han
283
20
0
30 Mar 2020
Contextual Constrained Learning for Dose-Finding Clinical Trials
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Hyun-Suk Lee
Cong Shen
James Jordon
M. Schaar
254
15
0
08 Jan 2020
Adaptive Modulation and Coding based on Reinforcement Learning for 5G Networks
Mateus P. Mota
D. C. Araújo
F. H. C. Neto
A. D. Almeida
F. Cavalcanti
66
59
0
25 Nov 2019
Multi-Armed Bandits with Correlated Arms
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2019
Samarth Gupta
Shreyas Chaudhari
Gauri Joshi
Osman Yağan
498
60
0
06 Nov 2019
Optimising Individual-Treatment-Effect Using Bandits
Jeroen Berrevoets
Sam Verboven
Wouter Verbeke
CML
100
3
0
16 Oct 2019
Gittins' theorem under uncertainty
Electronic Journal of Probability (EJP), 2019
Samuel N. Cohen
Tanut Treetanthiploet
258
4
0
12 Jul 2019
Rarely-switching linear bandits: optimization of causal effects for the real world
B. Lansdell
Sofia Triantafillou
Konrad Paul Kording
288
5
0
30 May 2019
Risk-Averse Explore-Then-Commit Algorithms for Finite-Time Bandits
IEEE Conference on Decision and Control (CDC), 2019
Ali Yekkehkhany
Ebrahim Arian
Mohammad Hajiesmaili
R. Nagi
268
12
0
30 Apr 2019
Correlated Multi-armed Bandits with a Latent Random Source
Samarth Gupta
Gauri Joshi
Osman Yağan
217
22
0
17 Aug 2018
Linear Bandits with Stochastic Delayed Feedback
International Conference on Machine Learning (ICML), 2018
Claire Vernade
Alexandra Carpentier
Tor Lattimore
Giovanni Zappella
Beyza Ermis
M. Brueckner
409
73
0
05 Jul 2018
Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits
Prashant Singh
Andreas Hellander
260
2
0
22 May 2018
Online Learning: A Comprehensive Survey
Guosheng Lin
Doyen Sahoo
Jing Lu
P. Zhao
OffRL
469
785
0
08 Feb 2018
Courtesy as a Means to Coordinate
Panayiotis Danassis
Boi Faltings
254
1
0
22 Jan 2018
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits
Huasen Wu
Xueying Guo
Xin Liu
162
29
0
12 Sep 2017
Latent Contextual Bandits and their Application to Personalized Recommendations for New Users
Li Zhou
Emma Brunskill
185
65
0
22 Apr 2016
A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit
Giuseppe Burtini
Jason L. Loeppky
Ramon Lawrence
419
123
0
02 Oct 2015
Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits
Huasen Wu
R. Srikant
Xin Liu
Chong Jiang
403
98
0
27 Apr 2015