Regret Bounds for Restless Markov Bandits


12 September 2012
R. Ortner
D. Ryabko
P. Auer
Rémi Munos

Papers citing "Regret Bounds for Restless Markov Bandits"

31 / 31 papers shown
Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation
Guojun Xiong
Jian Li
87
14
0
03 Oct 2023
Autoregressive Bandits
Francesco Bacchiocchi
Gianmarco Genalti
Davide Maran
Marco Mussi
Marcello Restelli
N. Gatti
Alberto Maria Metelli
70
5
0
12 Dec 2022
Networked Restless Bandits with Positive Externalities
Christine Herlihy
John P. Dickerson
84
3
0
09 Dec 2022
Stochastic Rising Bandits
Alberto Maria Metelli
F. Trovò
Matteo Pirola
Marcello Restelli
51
18
0
07 Dec 2022
Non-Stationary Bandits with Auto-Regressive Temporal Dependency
Qinyi Chen
Negin Golrezaei
Djallel Bouneffouf
AI4TS
91
13
0
28 Oct 2022
Optimistic Whittle Index Policy: Online Learning for Restless Bandits
Kai Wang
Lily Xu
Aparna Taneja
Milind Tambe
84
17
0
30 May 2022
Non-Stationary Bandit Learning via Predictive Sampling
Yueyang Liu
Kuang Xu
Benjamin Van Roy
124
18
0
04 May 2022
On learning Whittle index policy for restless bandits with scalable regret
N. Akbarzadeh
Aditya Mahajan
97
13
0
07 Feb 2022
A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
Yasin Abbasi-Yadkori
András Gyorgy
N. Lazić
77
22
0
17 Jan 2022
Offline RL With Resource Constrained Online Deployment
Jayanth Reddy Regatti
A. Deshmukh
Frank Cheng
Young Hun Jung
Abhishek Gupta
Ürün Dogan
OffRL
74
2
0
07 Oct 2021
Sublinear Regret for Learning POMDPs
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou
99
25
0
08 Jul 2021
Learning to Detect an Odd Restless Markov Arm with a Trembling Hand
P. Karthik
R. Sundaresan
47
5
0
08 May 2021
Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits
Siwei Wang
Longbo Huang
John C. S. Lui
OffRL
91
39
0
05 Nov 2020
Detecting an Odd Restless Markov Arm with a Trembling Hand
P. Karthik
R. Sundaresan
77
6
0
13 May 2020
Regime Switching Bandits
Xiang Zhou
Yi Xiong
Ningyuan Chen
Xuefeng Gao
81
19
0
26 Jan 2020
The Restless Hidden Markov Bandit with Linear Rewards and Side Information
M. Yemini
Amir Leshem
A. Somekh-Baruch
56
4
0
22 Oct 2019
Thompson Sampling in Non-Episodic Restless Bandits
Young Hun Jung
Marc Abeille
Ambuj Tewari
61
19
0
12 Oct 2019
Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies
Wesley Cowan
M. Katehakis
Daniel Pirutinsky
OffRL
44
4
0
13 Sep 2019
Restless dependent bandits with fading memory
O. Zadorozhnyi
Gilles Blanchard
Alexandra Carpentier
18
0
0
25 Jun 2019
Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems
Young Hun Jung
Ambuj Tewari
76
44
0
29 May 2019
Learning Multiple Markov Chains via Adaptive Allocation
M. S. Talebi
Odalric-Ambrym Maillard
50
1
0
27 May 2019
Variational Regret Bounds for Reinforcement Learning
Pratik Gajane
R. Ortner
P. Auer
137
62
0
14 May 2019
A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
Pablo Hernandez-Leal
Michael Kaisers
T. Baarslag
Enrique Munoz de Cote
88
275
0
28 Jul 2017
Approximations of the Restless Bandit Problem
Steffen Grunewalder
A. Khaleghi
46
11
0
22 Feb 2017
Online Learning for Wireless Distributed Computing
Yi-Hsuan Kao
Kwame-Lante Wright
Bhaskar Krishnamachari
F. Bai
60
6
0
09 Nov 2016
Time-Varying Gaussian Process Bandit Optimization
Ilija Bogunovic
Jonathan Scarlett
Volkan Cevher
128
98
0
25 Jan 2016
When are Kalman-filter restless bandits indexable?
C. Dance
T. Silander
34
12
0
15 Sep 2015
Optimal Exploration-Exploitation in a Multi-Armed-Bandit Problem with Non-stationary Rewards
Omar Besbes
Y. Gur
A. Zeevi
88
127
0
13 May 2014
Dynamic Rate and Channel Selection in Cognitive Radio Systems
Richard Combes
Alexandre Proutiere
94
46
0
23 Feb 2014
Regret Bounds for Reinforcement Learning with Policy Advice
M. G. Azar
A. Lazaric
Emma Brunskill
127
36
0
05 May 2013
Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
R. Ortner
D. Ryabko
OffRL
120
85
0
11 Feb 2013