Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions


4 June 2021
Tal Lancewicki
Shahar Segal
Tomer Koren
Yishay Mansour

Papers citing "Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions"

30 / 30 papers shown
Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs
Alexander Ryabchenko
Idan Attias
Daniel M. Roy
CLL
62
1
0
25 Mar 2025
Contextual Linear Bandits with Delay as Payoff
Mengxiao Zhang
Yingfei Wang
Haipeng Luo
193
2
0
18 Feb 2025
Individual Regret in Cooperative Stochastic Multi-Armed Bandits
Idan Barnea
Tal Lancewicki
Yishay Mansour
38
0
0
10 Nov 2024
Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks
Yinglun Xu
Zhiwei Wang
Gagandeep Singh
AAML
77
1
0
25 Oct 2024
Biased Dueling Bandits with Stochastic Delayed Feedback
Bongsoo Yi
Yue Kang
Yao Li
89
1
0
26 Aug 2024
Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching
Amit Attia
Ofir Gaash
Tomer Koren
64
0
0
14 Aug 2024
Merit-based Fair Combinatorial Semi-Bandit with Unrestricted Feedback Delays
Ziqun Chen
Kechao Cai
Zhuoyue Chen
Jinbei Zhang
John C. S. Lui
FaML
112
0
0
22 Jul 2024
Non-stochastic Bandits With Evolving Observations
Yogev Bar-On
Yishay Mansour
81
1
0
27 May 2024
Adversarial Bandits with Multi-User Delayed Feedback: Theory and Application
Yandi Li
Jianxiong Guo
Yupeng Li
Tian-sheng Wang
Weijia Jia
134
1
0
17 Oct 2023
Regret Analysis of Repeated Delegated Choice
Mohammadtaghi Hajiaghayi
Mohammad Mahdavi
Keivan Rezaei
Suho Shin
88
4
0
07 Oct 2023
Statistical Inference on Multi-armed Bandits with Delayed Feedback
Lei Shi
Jingshen Wang
Tianhao Wu
96
4
0
03 Jul 2023
Efficient Reinforcement Learning with Impaired Observability: Learning to Act with Delayed and Missing State Observations
Minshuo Chen
Jie Meng
Yunru Bai
Yinyu Ye
H. Vincent Poor
Mengdi Wang
71
0
0
02 Jun 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
83
6
0
15 May 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Tal Lancewicki
Aviv A. Rosenberg
Dmitry Sotnikov
55
3
0
13 May 2023
Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward
Washim Uddin Mondal
Vaneet Aggarwal
83
2
0
04 May 2023
Effective Dimension in Bandit Problems under Censorship
G. Guinet
Saurabh Amin
Patrick Jaillet
44
1
0
14 Feb 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
135
8
0
03 Feb 2023
Practical Bandits: An Industry Perspective
Bram van den Akker
Olivier Jeunen
Ying Li
Ben London
Zahra Nazari
Devesh Parekh
66
6
0
02 Feb 2023
Stochastic Contextual Bandits with Long Horizon Rewards
Yuzhen Qin
Yingcong Li
Fabio Pasqualetti
Maryam Fazel
Samet Oymak
93
3
0
02 Feb 2023
Dynamical Linear Bandits
Marco Mussi
Alberto Maria Metelli
Marcello Restelli
71
2
0
16 Nov 2022
Extending Open Bandit Pipeline to Simulate Industry Challenges
Bram van den Akker
N. Weber
Felipe Moraes
Dmitri Goldenberg
OffRL
47
1
0
09 Sep 2022
Learning in Stackelberg Games with Non-myopic Agents
Nika Haghtalab
Thodoris Lykouris
Sloan Nietert
Alexander Wei
189
32
0
19 Aug 2022
Some performance considerations when using multi-armed bandit algorithms in the presence of missing data
Xijin Chen
K. M. Lee
S. Villar
D. Robertson
96
1
0
08 May 2022
Partial Likelihood Thompson Sampling
Han Wu
Stefan Wager
LM&MA
60
2
0
02 Mar 2022
Thompson Sampling with Unrestricted Delays
Han Wu
Stefan Wager
75
8
0
24 Feb 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
131
22
0
31 Jan 2022
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
Aadirupa Saha
Shubham Gupta
61
10
0
06 Nov 2021
Nonstochastic Bandits and Experts with Arm-Dependent Delays
Dirk van der Hoeven
Nicolò Cesa-Bianchi
78
17
0
02 Nov 2021
No Weighted-Regret Learning in Adversarial Bandits with Delays
Ilai Bistritz
Zhengyuan Zhou
Xi Chen
Nicholas Bambos
Jose H. Blanchet
78
7
0
08 Mar 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
109
35
0
29 Dec 2020