2305.07911
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
13 May 2023
Tal Lancewicki
Aviv A. Rosenberg
Dmitry Sotnikov
Papers citing
"Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback"
8 / 8 papers shown
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
14
3
0
13 May 2024
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
18
18
0
29 Jun 2022
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
Yan Dai
Haipeng Luo
Liyu Chen
52
19
0
26 May 2022
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs
Dongruo Zhou
Quanquan Gu
73
43
0
23 May 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
59
21
0
31 Jan 2022
Cooperative Online Learning in Stochastic and Adversarial MDPs
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
56
3
0
31 Jan 2022
Nonstochastic Bandits with Composite Anonymous Feedback
Nicolò Cesa-Bianchi
Tommaso Cesari
Roberto Colomboni
Claudio Gentile
Yishay Mansour
72
39
0
06 Dec 2021
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
Daniel Vial
Advait Parulekar
Sanjay Shakkottai
R. Srikant
24
15
0
04 May 2021