
An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays

14 October 2019

Papers citing "An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays"

37 / 37 papers shown

| Title | Authors | Tag | | Date |
|---|---|---|---|---|
| Exploiting Curvature in Online Convex Optimization with Delayed Feedback | Hao Qiu, Emmanuel Esposito, Mengxiao Zhang | | 24 0 0 | 09 Jun 2025 |
| Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs | Alexander Ryabchenko, Idan Attias, Daniel M. Roy | CLL | 62 1 0 | 25 Mar 2025 |
| Bandit and Delayed Feedback in Online Structured Prediction | Yuki Shibukawa, Taira Tsuchiya, Shinsaku Sakaue, Kenji Yamanishi | OffRL | 111 0 0 | 26 Feb 2025 |
| Contextual Linear Bandits with Delay as Payoff | Mengxiao Zhang, Yingfei Wang, Haipeng Luo | | 193 2 0 | 18 Feb 2025 |
| Biased Dueling Bandits with Stochastic Delayed Feedback | Bongsoo Yi, Yue Kang, Yao Li | | 89 1 0 | 26 Aug 2024 |
| Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds | Shinji Ito, Taira Tsuchiya, Junya Honda | | 81 4 0 | 01 Mar 2024 |
| Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation | Nikki Lijing Kuang, Ming Yin, Mengdi Wang, Yu Wang, Yian Ma | | 90 6 0 | 29 Oct 2023 |
| Adversarial Bandits with Multi-User Delayed Feedback: Theory and Application | Yandi Li, Jianxiong Guo, Yupeng Li, Tian-sheng Wang, Weijia Jia | | 132 1 0 | 17 Oct 2023 |
| A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays | Saeed Masoudian, Julian Zimmert, Yevgeny Seldin | | 92 5 0 | 21 Aug 2023 |
| Delayed Bandits: When Do Intermediate Observations Help? | Emmanuel Esposito, Saeed Masoudian, Hao Qiu, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin | | 49 3 0 | 30 May 2023 |
| A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs | Dirk van der Hoeven, Lukas Zierahn, Tal Lancewicki, Aviv A. Rosenberg, Nicolò Cesa-Bianchi | | 83 6 0 | 15 May 2023 |
| Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback | Tal Lancewicki, Aviv A. Rosenberg, Dmitry Sotnikov | | 55 3 0 | 13 May 2023 |
| Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward | Washim Uddin Mondal, Vaneet Aggarwal | | 83 2 0 | 04 May 2023 |
| A Reduction-based Framework for Sequential Decision Making with Delayed Feedback | Yunchang Yang, Hangshi Zhong, Tianhao Wu, B. Liu, Liwei Wang, S. Du | OffRL | 135 8 0 | 03 Feb 2023 |
| Stochastic Contextual Bandits with Long Horizon Rewards | Yuzhen Qin, Yingcong Li, Fabio Pasqualetti, Maryam Fazel, Samet Oymak | | 93 3 0 | 02 Feb 2023 |
| Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning | Jiatai Huang, Yan Dai, Longbo Huang | | 67 6 0 | 25 Jan 2023 |
| Multi-Agent Reinforcement Learning with Reward Delays | Yuyang Zhang, Runyu Zhang, Yu Gu, Na Li | | 79 10 0 | 02 Dec 2022 |
| On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits | Jialin Yi, Milan Vojnović | | 65 3 0 | 30 Nov 2022 |
| A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback | Saeed Masoudian, Julian Zimmert, Yevgeny Seldin | | 71 20 0 | 29 Jun 2022 |
| Lazy Queries Can Reduce Variance in Zeroth-order Optimization | Quan-Wu Xiao, Qing Ling, Tianyi Chen | | 85 0 0 | 14 Jun 2022 |
| Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback | Tianyi Lin, Aldo Pacchiano, Yaodong Yu, Michael I. Jordan | | 80 0 0 | 15 May 2022 |
| Bounded Memory Adversarial Bandits with Composite Anonymous Delayed Feedback | Zongqi Wan, Xiaoming Sun, Jialin Zhang | | 43 1 0 | 27 Apr 2022 |
| Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences | Aadirupa Saha, Pierre Gaillard | | 71 7 0 | 14 Feb 2022 |
| Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback | Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv A. Rosenberg | | 131 22 0 | 31 Jan 2022 |
| Isotuning With Applications To Scale-Free Online Learning | Laurent Orseau, Marcus Hutter | | 76 6 0 | 29 Dec 2021 |
| Nonstochastic Bandits with Composite Anonymous Feedback | Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Claudio Gentile, Yishay Mansour | | 191 40 0 | 06 Dec 2021 |
| Nonstochastic Bandits and Experts with Arm-Dependent Delays | Dirk van der Hoeven, Nicolò Cesa-Bianchi | | 78 17 0 | 02 Nov 2021 |
| Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays | Jiatai Huang, Yan Dai, Longbo Huang | AI4CE | 76 2 0 | 26 Oct 2021 |
| Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions | Tal Lancewicki, Shahar Segal, Tomer Koren, Yishay Mansour | | 74 41 0 | 04 Jun 2021 |
| No Weighted-Regret Learning in Adversarial Bandits with Delays | Ilai Bistritz, Zhengyuan Zhou, Xi Chen, Nicholas Bambos, Jose H. Blanchet | | 65 7 0 | 08 Mar 2021 |
| Adversarial Tracking Control via Strongly Adaptive Online Learning with Memory | Zhiyu Zhang, Ashok Cutkosky, I. Paschalidis | | 94 15 0 | 02 Feb 2021 |
| Learning Adversarial Markov Decision Processes with Delayed Feedback | Tal Lancewicki, Aviv A. Rosenberg, Yishay Mansour | | 109 35 0 | 29 Dec 2020 |
| Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism | Yu-Guan Hsieh, F. Iutzeler, J. Malick, P. Mertikopoulos | AI4CE | 101 30 0 | 21 Dec 2020 |
| Adapting to Delays and Data in Adversarial Multi-Armed Bandits | András Gyorgy, Pooria Joulani | | 38 31 0 | 12 Oct 2020 |
| To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation | Sakshi Arya, Yuhong Yang | | 28 0 0 | 26 May 2020 |
| Regret Bounds for Batched Bandits | Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni | | 92 63 0 | 11 Oct 2019 |
| Nonstochastic Multiarmed Bandits with Unrestricted Delays | Tobias Sommer Thune, Nicolò Cesa-Bianchi, Yevgeny Seldin | | 96 53 0 | 03 Jun 2019 |