1904.07272
Cited By
Introduction to Multi-Armed Bandits
15 April 2019
Aleksandrs Slivkins
Papers citing
"Introduction to Multi-Armed Bandits"
50 / 137 papers shown
Counterfactual Multi-player Bandits for Explainable Recommendation Diversification
Yansen Zhang
Bowei He
Xiaokun Zhang
Haolun Wu
Zexu Sun
Chen Ma
145
1
0
27 May 2025
Robust Online Learning with Private Information
Kyohei Okumura
123
0
0
08 May 2025
OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents
Raghav Thind
Youran Sun
Ling Liang
Haizhao Yang
LLMAG
146
0
0
23 Apr 2025
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries
Arnab Maiti
Zhiyuan Fan
Kevin Jamieson
Lillian J. Ratliff
Gabriele Farina
313
0
0
01 Apr 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure
Aleksandrs Slivkins
Yunzong Xu
Shiliang Zuo
352
1
0
06 Mar 2025
A Theoretical Model for Grit in Pursuing Ambitious Ends
Avrim Blum
Emily Diana
Kavya Ravichandran
A. Tolbert
129
0
0
04 Mar 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context
Jianyu Xu
Qiuzhuang Sun
Yang Yang
Huadong Mo
Daoyi Dong
173
0
0
24 Feb 2025
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu
Tengyu Xu
Di Jin
Karthik Abinav Sankararaman
Yun He
...
Eryk Helenowski
Chen Zhu
Sinong Wang
Hao Ma
Han Fang
LRM
145
8
0
29 Jan 2025
Fuzzing at Scale: The Untold Story of the Scheduler
Ivica Nikolić
Racchit Jain
142
0
0
28 Jan 2025
Online Joint Assortment-Inventory Optimization under MNL Choices
Yong Liang
Xiaojie Mao
Shiyuan Wang
130
0
0
03 Jan 2025
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit
Junyu Cao
Ruijiang Gao
Esmaeil Keyvanshokooh
150
1
0
18 Oct 2024
AutoPersuade: A Framework for Evaluating and Explaining Persuasive Arguments
Till Raphael Saenger
Musashi Hinck
Justin Grimmer
Brandon M Stewart
106
2
0
11 Oct 2024
Accurate and Regret-aware Numerical Problem Solver for Tabular Question Answering
Yuxiang Wang
Jianzhong Qi
Junhao Gan
LMTD
145
3
0
10 Oct 2024
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Yu Chen
Jiatai Huang
Yan Dai
Longbo Huang
119
0
0
04 Oct 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Arun Verma
Zhongxiang Dai
Xiaoqiang Lin
Patrick Jaillet
K. H. Low
126
5
0
24 Jul 2024
Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality
Antoine Scheid
Aymeric Capitaine
Etienne Boursier
Eric Moulines
Michael I. Jordan
Alain Durmus
115
4
0
28 Jun 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Xutong Liu
Siwei Wang
Jinhang Zuo
Han Zhong
Xuchuang Wang
Zhiyong Wang
Shuai Li
Mohammad Hajiesmaili
J. C. Lui
Wei Chen
150
3
0
03 Jun 2024
Paying to Do Better: Games with Payments between Learning Agents
Y. Kolumbus
Joe Halpern
Éva Tardos
107
2
0
31 May 2024
Batched Stochastic Bandit for Nondegenerate Functions
Yu Liu
Yunlu Shu
Tianyu Wang
115
0
0
09 May 2024
Reward-Driven Automated Curriculum Learning for Interaction-Aware Self-Driving at Unsignalized Intersections
Zeng Peng
Xiao Zhou
Lei Zheng
Yubin Wang
Jun Ma
152
4
0
20 Mar 2024
Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents
Seyed A. Esmaeili
Suho Shin
Aleksandrs Slivkins
83
4
0
13 Dec 2023
Active teacher selection for reinforcement learning from human feedback
Rachel Freedman
Justin Svegliato
K. H. Wray
Stuart J. Russell
119
6
0
23 Oct 2023
Bandit Social Learning: Exploration under Myopic Behavior
Kiarash Banihashem
Mohammadtaghi Hajiaghayi
Suho Shin
Aleksandrs Slivkins
244
4
0
15 Feb 2023
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee
Sean R. Sinclair
Milind Tambe
Lily Xu
Chao Yu
AI4TS
101
7
0
30 Sep 2022
Learning in Stackelberg Games with Non-myopic Agents
Nika Haghtalab
Thodoris Lykouris
Sloan Nietert
Alexander Wei
112
32
0
19 Aug 2022
Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds
Zichuan Xu
Jiangkai Wu
Qiufen Xia
Pan Zhou
Jiankang Ren
Huizhi Liang
93
4
0
12 Aug 2020
Model Selection in Contextual Stochastic Bandit Problems
Aldo Pacchiano
My Phan
Yasin Abbasi-Yadkori
Anup B. Rao
Julian Zimmert
Tor Lattimore
Csaba Szepesvári
134
94
0
03 Mar 2020
Introduction to Online Convex Optimization
Elad Hazan
OffRL
121
1,922
0
07 Sep 2019
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
Julian Zimmert
Tor Lattimore
101
34
0
28 May 2019
Fiduciary Bandits
Gal Bahar
Omer Ben-Porat
Kevin Leyton-Brown
Moshe Tennenholtz
82
9
0
16 May 2019
Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate With Collision Information, Sublinear Without
Sébastien Bubeck
Yuanzhi Li
Yuval Peres
Mark Sellke
80
45
0
28 Apr 2019
Better Algorithms for Stochastic Bandits with Adversarial Corruptions
Anupam Gupta
Tomer Koren
Kunal Talwar
AAML
75
152
0
22 Feb 2019
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting
A. Krishnamurthy
John Langford
Aleksandrs Slivkins
Chicheng Zhang
OffRL
105
66
0
05 Feb 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free
Yifang Chen
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
92
132
0
03 Feb 2019
Improved Path-length Regret Bounds for Bandits
Sébastien Bubeck
Yuanzhi Li
Haipeng Luo
Chen-Yu Wei
82
46
0
29 Jan 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
Julian Zimmert
Haipeng Luo
Chen-Yu Wei
75
81
0
25 Jan 2019
Adversarial Bandits with Knapsacks
Nicole Immorlica
Karthik Abinav Sankararaman
Robert Schapire
Aleksandrs Slivkins
105
113
0
28 Nov 2018
Unifying the stochastic and the adversarial Bandits with Knapsack
A. Rangi
M. Franceschetti
Long Tran-Thanh
92
27
0
23 Oct 2018
SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
Etienne Boursier
Vianney Perchet
79
99
0
21 Sep 2018
Acceleration through Optimistic No-Regret Dynamics
Jun-Kun Wang
Jacob D. Abernethy
87
44
0
27 Jul 2018
The Externalities of Exploration and How Data Diversity Helps Exploitation
Manish Raghavan
Aleksandrs Slivkins
Jennifer Wortman Vaughan
Zhiwei Steven Wu
147
52
0
01 Jun 2018
Stochastic bandits robust to adversarial corruptions
Thodoris Lykouris
Vahab Mirrokni
R. Leme
AAML
91
203
0
25 Mar 2018
A Reductions Approach to Fair Classification
Alekh Agarwal
A. Beygelzimer
Miroslav Dudík
John Langford
Hanna M. Wallach
FaML
171
1,094
0
06 Mar 2018
Practical Contextual Bandits with Regression Oracles
Dylan J. Foster
Alekh Agarwal
Miroslav Dudík
Haipeng Luo
Robert Schapire
256
125
0
03 Mar 2018
A Contextual Bandit Bake-off
A. Bietti
Alekh Agarwal
John Langford
225
104
0
12 Feb 2018
More Adaptive Algorithms for Adversarial Bandits
Chen-Yu Wei
Haipeng Luo
95
181
0
10 Jan 2018
Selling to a No-Regret Buyer
M. Braverman
Jieming Mao
Jon Schneider
Matt Weinberg
89
83
0
25 Nov 2017
Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness
Michael Kearns
Seth Neel
Aaron Roth
Zhiwei Steven Wu
FaML
136
775
0
14 Nov 2017
Sparsity, variance and curvature in multi-armed bandits
Sébastien Bubeck
Michael B. Cohen
Yuanzhi Li
97
60
0
03 Nov 2017
Training GANs with Optimism
C. Daskalakis
Andrew Ilyas
Vasilis Syrgkanis
Haoyang Zeng
129
514
0
31 Oct 2017