An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays
14 October 2019
Julian Zimmert
Yevgeny Seldin

Papers citing "An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays"

37 / 37 papers shown
Exploiting Curvature in Online Convex Optimization with Delayed Feedback
Hao Qiu
Emmanuel Esposito
Mengxiao Zhang
24
0
0
09 Jun 2025
Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs
Alexander Ryabchenko
Idan Attias
Daniel M. Roy
CLL
62
1
0
25 Mar 2025
Bandit and Delayed Feedback in Online Structured Prediction
Yuki Shibukawa
Taira Tsuchiya
Shinsaku Sakaue
Kenji Yamanishi
OffRL
111
0
0
26 Feb 2025
Contextual Linear Bandits with Delay as Payoff
Mengxiao Zhang
Yingfei Wang
Haipeng Luo
193
2
0
18 Feb 2025
Biased Dueling Bandits with Stochastic Delayed Feedback
Bongsoo Yi
Yue Kang
Yao Li
89
1
0
26 Aug 2024
Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds
Shinji Ito
Taira Tsuchiya
Junya Honda
81
4
0
01 Mar 2024
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Nikki Lijing Kuang
Ming Yin
Mengdi Wang
Yu Wang
Yian Ma
90
6
0
29 Oct 2023
Adversarial Bandits with Multi-User Delayed Feedback: Theory and Application
Yandi Li
Jianxiong Guo
Yupeng Li
Tian-sheng Wang
Weijia Jia
132
1
0
17 Oct 2023
A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
92
5
0
21 Aug 2023
Delayed Bandits: When Do Intermediate Observations Help?
Emmanuel Esposito
Saeed Masoudian
Hao Qiu
Dirk van der Hoeven
Nicolò Cesa-Bianchi
Yevgeny Seldin
49
3
0
30 May 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
83
6
0
15 May 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Tal Lancewicki
Aviv A. Rosenberg
Dmitry Sotnikov
55
3
0
13 May 2023
Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward
Washim Uddin Mondal
Vaneet Aggarwal
83
2
0
04 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
135
8
0
03 Feb 2023
Stochastic Contextual Bandits with Long Horizon Rewards
Yuzhen Qin
Yingcong Li
Fabio Pasqualetti
Maryam Fazel
Samet Oymak
93
3
0
02 Feb 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Jiatai Huang
Yan Dai
Longbo Huang
67
6
0
25 Jan 2023
Multi-Agent Reinforcement Learning with Reward Delays
Yuyang Zhang
Runyu Zhang
Yu Gu
Na Li
79
10
0
02 Dec 2022
On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits
Jialin Yi
Milan Vojnović
65
3
0
30 Nov 2022
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
71
20
0
29 Jun 2022
Lazy Queries Can Reduce Variance in Zeroth-order Optimization
Quan-Wu Xiao
Qing Ling
Tianyi Chen
85
0
0
14 Jun 2022
Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback
Tianyi Lin
Aldo Pacchiano
Yaodong Yu
Michael I. Jordan
80
0
0
15 May 2022
Bounded Memory Adversarial Bandits with Composite Anonymous Delayed Feedback
Zongqi Wan
Xiaoming Sun
Jialin Zhang
43
1
0
27 Apr 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences
Aadirupa Saha
Pierre Gaillard
71
7
0
14 Feb 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
131
22
0
31 Jan 2022
Isotuning With Applications To Scale-Free Online Learning
Laurent Orseau
Marcus Hutter
76
6
0
29 Dec 2021
Nonstochastic Bandits with Composite Anonymous Feedback
Nicolò Cesa-Bianchi
Tommaso Cesari
Roberto Colomboni
Claudio Gentile
Yishay Mansour
191
40
0
06 Dec 2021
Nonstochastic Bandits and Experts with Arm-Dependent Delays
Dirk van der Hoeven
Nicolò Cesa-Bianchi
78
17
0
02 Nov 2021
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays
Jiatai Huang
Yan Dai
Longbo Huang
AI4CE
76
2
0
26 Oct 2021
Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
Tal Lancewicki
Shahar Segal
Tomer Koren
Yishay Mansour
74
41
0
04 Jun 2021
No Weighted-Regret Learning in Adversarial Bandits with Delays
Ilai Bistritz
Zhengyuan Zhou
Xi Chen
Nicholas Bambos
Jose H. Blanchet
65
7
0
08 Mar 2021
Adversarial Tracking Control via Strongly Adaptive Online Learning with Memory
Zhiyu Zhang
Ashok Cutkosky
I. Paschalidis
94
15
0
02 Feb 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
109
35
0
29 Dec 2020
Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism
Yu-Guan Hsieh
F. Iutzeler
J. Malick
P. Mertikopoulos
AI4CE
101
30
0
21 Dec 2020
Adapting to Delays and Data in Adversarial Multi-Armed Bandits
András Gyorgy
Pooria Joulani
38
31
0
12 Oct 2020
To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation
Sakshi Arya
Yuhong Yang
28
0
0
26 May 2020
Regret Bounds for Batched Bandits
Hossein Esfandiari
Amin Karbasi
Abbas Mehrabian
Vahab Mirrokni
92
63
0
11 Oct 2019
Nonstochastic Multiarmed Bandits with Unrestricted Delays
Tobias Sommer Thune
Nicolò Cesa-Bianchi
Yevgeny Seldin
96
53
0
03 Jun 2019