Q-Learning Lagrange Policies for Multi-Action Restless Bandits

22 June 2021

Papers citing "Q-Learning Lagrange Policies for Multi-Action Restless Bandits"

Title
Whittle Index Learning Algorithms for Restless Bandits with Constant Stepsizes | Vishesh Mittal, R. Meshram, Surya Prakash | 16 0 0 | 06 Sep 2024
The Bandit Whisperer: Communication Learning for Restless Bandits | Yunfan Zhao, Tonghan Wang, Dheeraj M. Nagaraj, Aparna Taneja, Milind Tambe | 44 5 0 | 11 Aug 2024
Tabular and Deep Learning for the Whittle Index | Francisco Robledo Relaño, Vivek Borkar, U. Ayesta, Konstantin Avrachenkov | 16 2 0 | 04 Jun 2024
Deep reinforcement learning for weakly coupled MDP's with continuous actions | Francisco Robledo, U. Ayesta, Konstantin Avrachenkov | 23 0 0 | 03 Jun 2024
Structured Reinforcement Learning for Delay-Optimal Data Transmission in Dense mmWave Networks | Shu-Fan Wang, Guojun Xiong, Shichen Zhang, Huacheng Zeng, Jian Li, Shivendra Panwar | 17 0 0 | 25 Apr 2024
Evaluating the Effectiveness of Index-Based Treatment Allocation | Niclas Boehmer, Yash Nair, Sanket Shah, Lucas Janson, Aparna Taneja, Milind Tambe | 18 3 0 | 19 Feb 2024
Fairness of Exposure in Online Restless Multi-armed Bandits | Archit Sood, Shweta Jain, Sujit Gujar | 21 1 0 | 09 Feb 2024
Online Restless Multi-Armed Bandits with Long-Term Fairness Constraints | Shu-Fan Wang, Guojun Xiong, Jian Li | 36 6 0 | 16 Dec 2023
Weakly Coupled Deep Q-Networks | Ibrahim El Shar, Daniel R. Jiang | 11 2 0 | 28 Oct 2023
Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization | Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj M. Nagaraj, K. Tuyls, Aparna Taneja, Milind Tambe | 8 7 0 | 23 Oct 2023
Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation | Guojun Xiong, Jian Li | 20 12 0 | 03 Oct 2023
Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits | Abheek Ghosh, Dheeraj M. Nagaraj, Manish Jain, Milind Tambe | 11 9 0 | 31 Oct 2022
DeepTOP: Deep Threshold-Optimal Policy for MDPs and RMABs | Khaled Nakhleh, I.-Hong Hou | 62 5 0 | 18 Sep 2022
Optimistic Whittle Index Policy: Online Learning for Restless Bandits | Kai Wang, Lily Xu, Aparna Taneja, Milind Tambe | 31 16 0 | 30 May 2022
Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health | Aditya Mate, Lovish Madaan, Aparna Taneja, N. Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham, Milind Tambe | 13 52 0 | 16 Sep 2021
Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning | J. Killian, Lily Xu, Arpita Biswas, Milind Tambe | 12 5 0 | 04 Jul 2021
Planning to Fairly Allocate: Probabilistic Fairness in the Restless Bandit Setting | Christine Herlihy, Aviva Prins, A. Srinivasan, John P. Dickerson | 10 13 0 | 14 Jun 2021