Adapting to Delays and Data in Adversarial Multi-Armed Bandits


12 October 2020
András György
Pooria Joulani

Papers citing "Adapting to Delays and Data in Adversarial Multi-Armed Bandits"

13 / 13 papers shown
Contextual Linear Bandits with Delay as Payoff
Mengxiao Zhang
Yingfei Wang
Haipeng Luo
193
2
0
18 Feb 2025
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
135
8
0
03 Feb 2023
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
71
20
0
29 Jun 2022
Generalized Delayed Feedback Model with Post-Click Information in Recommender Systems
Jia-Qi Yang
De-Chuan Zhan
58
10
0
01 Jun 2022
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
Yan Dai
Haipeng Luo
Liyu Chen
114
19
0
26 May 2022
Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback
Tianyi Lin
Aldo Pacchiano
Yaodong Yu
Michael I. Jordan
80
0
0
15 May 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
131
22
0
31 Jan 2022
Isotuning With Applications To Scale-Free Online Learning
Laurent Orseau
Marcus Hutter
76
6
0
29 Dec 2021
Nonstochastic Bandits and Experts with Arm-Dependent Delays
Dirk van der Hoeven
Nicolò Cesa-Bianchi
78
17
0
02 Nov 2021
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays
Jiatai Huang
Yan Dai
Longbo Huang
AI4CE
76
2
0
26 Oct 2021
Machine Learning for Fraud Detection in E-Commerce: A Research Agenda
Niek Tax
Kees Jan de Vries
Mathijs de Jong
Nikoleta Dosoula
Bram van den Akker
Jon Smith
Olivier Thuong
Lucas Bernardi
34
21
0
05 Jul 2021
Cooperative Online Learning with Feedback Graphs
Nicolò Cesa-Bianchi
Tommaso Cesari
R. D. Vecchia
71
3
0
09 Jun 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
109
35
0
29 Dec 2020