Nonstochastic Multiarmed Bandits with Unrestricted Delays

3 June 2019

Papers citing "Nonstochastic Multiarmed Bandits with Unrestricted Delays"

22 / 22 papers shown

Title | Authors | Tag | Counts | Date
Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs | Alexander Ryabchenko, Idan Attias, Daniel M. Roy | CLL | 62 1 0 | 25 Mar 2025
Contextual Linear Bandits with Delay as Payoff | Mengxiao Zhang, Yingfei Wang, Haipeng Luo | | 193 2 0 | 18 Feb 2025
Biased Dueling Bandits with Stochastic Delayed Feedback | Bongsoo Yi, Yue Kang, Yao Li | | 89 1 0 | 26 Aug 2024
Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward | Washim Uddin Mondal, Vaneet Aggarwal | | 83 2 0 | 04 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback | Yunchang Yang, Hangshi Zhong, Tianhao Wu, B. Liu, Liwei Wang, S. Du | OffRL | 135 8 0 | 03 Feb 2023
On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits | Jialin Yi, Milan Vojnović | | 65 3 0 | 30 Nov 2022
Dynamical Linear Bandits | Marco Mussi, Alberto Maria Metelli, Marcello Restelli | | 71 2 0 | 16 Nov 2022
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback | Saeed Masoudian, Julian Zimmert, Yevgeny Seldin | | 71 20 0 | 29 Jun 2022
Multi-Armed Bandit Problem with Temporally-Partitioned Rewards: When Partial Feedback Counts | Giulia Romano, Andrea Agostini, F. Trovò, N. Gatti, Marcello Restelli | | 42 2 0 | 01 Jun 2022
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback | Yan Dai, Haipeng Luo, Liyu Chen | | 112 19 0 | 26 May 2022
Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback | Tianyi Lin, Aldo Pacchiano, Yaodong Yu, Michael I. Jordan | | 80 0 0 | 15 May 2022
Partial Likelihood Thompson Sampling | Han Wu, Stefan Wager | LM&MA | 56 2 0 | 02 Mar 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences | Aadirupa Saha, Pierre Gaillard | | 71 7 0 | 14 Feb 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback | Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv A. Rosenberg | | 131 22 0 | 31 Jan 2022
Isotuning With Applications To Scale-Free Online Learning | Laurent Orseau, Marcus Hutter | | 74 6 0 | 29 Dec 2021
Nonstochastic Bandits and Experts with Arm-Dependent Delays | Dirk van der Hoeven, Nicolò Cesa-Bianchi | | 78 17 0 | 02 Nov 2021
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays | Jiatai Huang, Yan Dai, Longbo Huang | AI4CE | 76 2 0 | 26 Oct 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback | Tal Lancewicki, Aviv A. Rosenberg, Yishay Mansour | | 109 35 0 | 29 Dec 2020
Peer Offloading with Delayed Feedback in Fog Networks | Miao Yang, Hongbin Zhu, H. Qian, Y. Koucheryavy, K. Samouylov, Haifeng Wang | | 28 12 0 | 24 Nov 2020
Stochastic bandits with arm-dependent delays | Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko | | 80 45 0 | 18 Jun 2020
To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation | Sakshi Arya, Yuhong Yang | | 28 0 0 | 26 May 2020
An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays | Julian Zimmert, Yevgeny Seldin | | 83 53 0 | 14 Oct 2019