ResearchTrend.AI
arXiv: 2408.05686
The Bandit Whisperer: Communication Learning for Restless Bandits

11 August 2024
Yunfan Zhao
Tonghan Wang
Dheeraj M. Nagaraj
Aparna Taneja
Milind Tambe

Papers citing "The Bandit Whisperer: Communication Learning for Restless Bandits"

8 / 8 papers shown
Policy-to-Language: Train LLMs to Explain Decisions with Flow-Matching Generated Rewards
Xinyi Yang
Liang Zeng
Heng Dong
C. Yu
X. Wu
H. Yang
Yu Wang
Milind Tambe
Tonghan Wang
68
2
0
18 Feb 2025
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations
Ziyu Wang
Hao Li
Di Huang
Chae-Won Shin
Amir M. Rahmani
LM&MA
40
8
0
28 Sep 2024
Improving the Prediction of Individual Engagement in Recommendations Using Cognitive Models
Roderick Seow
Yunfan Zhao
Duncan Wood
Milind Tambe
Cleotilde Gonzalez
25
4
0
28 Aug 2024
A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health
Nikhil Behari
Edwin Zhang
Yunfan Zhao
Aparna Taneja
Dheeraj M. Nagaraj
Milind Tambe
45
9
0
22 Feb 2024
Low-Rank Modular Reinforcement Learning via Muscle Synergy
Heng Dong
Tonghan Wang
Jiayuan Liu
Chongjie Zhang
50
17
0
26 Oct 2022
Multi-User Reinforcement Learning with Low Rank Rewards
Naman Agarwal
Prateek Jain
S. Kowshik
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
25
1
0
11 Oct 2022
NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL
Khaled Nakhleh
Santosh Ganji
Ping-Chun Hsieh
I.-Hong Hou
S. Shakkottai
53
37
0
05 Oct 2021
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
Tonghan Wang
Heng Dong
V. Lesser
Chongjie Zhang
51
208
0
18 Mar 2020