Introduction to Multi-Armed Bandits

15 April 2019

Papers citing "Introduction to Multi-Armed Bandits"

50 / 137 papers shown

Counterfactual Multi-player Bandits for Explainable Recommendation Diversification Yansen Zhang Bowei He Xiaokun Zhang Haolun Wu Zexu Sun Chen Ma 145 1 0 27 May 2025
Robust Online Learning with Private Information Kyohei Okumura 123 0 0 08 May 2025
OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents Raghav Thind Youran Sun Ling Liang Haizhao Yang LLMAG 146 0 0 23 Apr 2025
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries Arnab Maiti Zhiyuan Fan Kevin Jamieson Lillian J. Ratliff Gabriele Farina 313 0 0 01 Apr 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure Aleksandrs Slivkins Yunzong Xu Shiliang Zuo 352 1 0 06 Mar 2025
A Theoretical Model for Grit in Pursuing Ambitious Ends Avrim Blum Emily Diana Kavya Ravichandran A. Tolbert 129 0 0 04 Mar 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context Jianyu Xu Qiuzhuang Sun Yang Yang Huadong Mo Daoyi Dong 173 0 0 24 Feb 2025
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization Zishun Yu Tengyu Xu Di Jin Karthik Abinav Sankararaman Yun He ... Eryk Helenowski Chen Zhu Sinong Wang Hao Ma Han Fang LRM 145 8 0 29 Jan 2025
Fuzzing at Scale: The Untold Story of the Scheduler Ivica Nikolić Racchit Jain 142 0 0 28 Jan 2025
Online Joint Assortment-Inventory Optimization under MNL Choices Yong Liang Xiaojie Mao Shiyuan Wang 130 0 0 03 Jan 2025
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit Junyu Cao Ruijiang Gao Esmaeil Keyvanshokooh 150 1 0 18 Oct 2024
AutoPersuade: A Framework for Evaluating and Explaining Persuasive Arguments Till Raphael Saenger Musashi Hinck Justin Grimmer Brandon M Stewart 106 2 0 11 Oct 2024
Accurate and Regret-aware Numerical Problem Solver for Tabular Question Answering Yuxiang Wang Jianzhong Qi Junhao Gan LMTD 145 3 0 10 Oct 2024
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs Yu Chen Jiatai Huang Yan Dai Longbo Huang 119 0 0 04 Oct 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback Arun Verma Zhongxiang Dai Xiaoqiang Lin Patrick Jaillet K. H. Low 126 5 0 24 Jul 2024
Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality Antoine Scheid Aymeric Capitaine Etienne Boursier Eric Moulines Michael I. Jordan Alain Durmus 115 4 0 28 Jun 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond Xutong Liu Siwei Wang Jinhang Zuo Han Zhong Xuchuang Wang Zhiyong Wang Shuai Li Mohammad Hajiesmaili J. C. Lui Wei Chen 150 3 0 03 Jun 2024
Paying to Do Better: Games with Payments between Learning Agents Y. Kolumbus Joe Halpern Éva Tardos 107 2 0 31 May 2024
Batched Stochastic Bandit for Nondegenerate Functions Yu Liu Yunlu Shu Tianyu Wang 115 0 0 09 May 2024
Reward-Driven Automated Curriculum Learning for Interaction-Aware Self-Driving at Unsignalized Intersections Zeng Peng Xiao Zhou Lei Zheng Yubin Wang Jun Ma 152 4 0 20 Mar 2024
Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents Seyed A. Esmaeili Suho Shin Aleksandrs Slivkins 83 4 0 13 Dec 2023
Active teacher selection for reinforcement learning from human feedback Rachel Freedman Justin Svegliato K. H. Wray Stuart J. Russell 119 6 0 23 Oct 2023
Bandit Social Learning: Exploration under Myopic Behavior Kiarash Banihashem Mohammadtaghi Hajiaghayi Suho Shin Aleksandrs Slivkins 244 4 0 15 Feb 2023
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits Siddhartha Banerjee Sean R. Sinclair Milind Tambe Lily Xu Chao Yu AI4TS 101 7 0 30 Sep 2022
Learning in Stackelberg Games with Non-myopic Agents Nika Haghtalab Thodoris Lykouris Sloan Nietert Alexander Wei 112 32 0 19 Aug 2022
Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds Zichuan Xu Jiangkai Wu Qiufen Xia Pan Zhou Jiankang Ren Huizhi Liang 93 4 0 12 Aug 2020
Model Selection in Contextual Stochastic Bandit Problems Aldo Pacchiano My Phan Yasin Abbasi-Yadkori Anup B. Rao Julian Zimmert Tor Lattimore Csaba Szepesvári 134 94 0 03 Mar 2020
Introduction to Online Convex Optimization Elad Hazan OffRL 121 1,922 0 07 Sep 2019
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio Julian Zimmert Tor Lattimore 101 34 0 28 May 2019
Fiduciary Bandits Gal Bahar Omer Ben-Porat Kevin Leyton-Brown Moshe Tennenholtz 82 9 0 16 May 2019
Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate With Collision Information, Sublinear Without Sébastien Bubeck Yuanzhi Li Yuval Peres Mark Sellke 80 45 0 28 Apr 2019
Better Algorithms for Stochastic Bandits with Adversarial Corruptions Anupam Gupta Tomer Koren Kunal Talwar AAML 75 152 0 22 Feb 2019
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting A. Krishnamurthy John Langford Aleksandrs Slivkins Chicheng Zhang OffRL 105 66 0 05 Feb 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free Yifang Chen Chung-Wei Lee Haipeng Luo Chen-Yu Wei 92 132 0 03 Feb 2019
Improved Path-length Regret Bounds for Bandits Sébastien Bubeck Yuanzhi Li Haipeng Luo Chen-Yu Wei 82 46 0 29 Jan 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously Julian Zimmert Haipeng Luo Chen-Yu Wei 75 81 0 25 Jan 2019
Adversarial Bandits with Knapsacks Nicole Immorlica Karthik Abinav Sankararaman Robert Schapire Aleksandrs Slivkins 105 113 0 28 Nov 2018
Unifying the stochastic and the adversarial Bandits with Knapsack A. Rangi M. Franceschetti Long Tran-Thanh 92 27 0 23 Oct 2018
SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits Etienne Boursier Vianney Perchet 79 99 0 21 Sep 2018
Acceleration through Optimistic No-Regret Dynamics Jun-Kun Wang Jacob D. Abernethy 87 44 0 27 Jul 2018
The Externalities of Exploration and How Data Diversity Helps Exploitation Manish Raghavan Aleksandrs Slivkins Jennifer Wortman Vaughan Zhiwei Steven Wu 147 52 0 01 Jun 2018
Stochastic bandits robust to adversarial corruptions Thodoris Lykouris Vahab Mirrokni R. Leme AAML 91 203 0 25 Mar 2018
A Reductions Approach to Fair Classification Alekh Agarwal A. Beygelzimer Miroslav Dudík John Langford Hanna M. Wallach FaML 171 1,094 0 06 Mar 2018
Practical Contextual Bandits with Regression Oracles Dylan J. Foster Alekh Agarwal Miroslav Dudík Haipeng Luo Robert Schapire 256 125 0 03 Mar 2018
A Contextual Bandit Bake-off A. Bietti Alekh Agarwal John Langford 225 104 0 12 Feb 2018
More Adaptive Algorithms for Adversarial Bandits Chen-Yu Wei Haipeng Luo 95 181 0 10 Jan 2018
Selling to a No-Regret Buyer M. Braverman Jieming Mao Jon Schneider Matt Weinberg 89 83 0 25 Nov 2017
Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness Michael Kearns Seth Neel Aaron Roth Zhiwei Steven Wu FaML 136 775 0 14 Nov 2017
Sparsity, variance and curvature in multi-armed bandits Sébastien Bubeck Michael B. Cohen Yuanzhi Li 97 60 0 03 Nov 2017
Training GANs with Optimism C. Daskalakis Andrew Ilyas Vasilis Syrgkanis Haoyang Zeng 129 514 0 31 Oct 2017