
Bayesian Collaborative Bandits with Thompson Sampling for Improved Outreach in Maternal Health Program

Abstract

Mobile health (mHealth) programs face a critical challenge in optimizing the timing of automated health information calls to beneficiaries. This challenge has been formulated as a collaborative multi-armed bandit problem, requiring online learning of a low-rank reward matrix. Existing solutions often rely on heuristic combinations of offline matrix completion and exploration strategies. In this work, we propose a principled Bayesian approach using Thompson Sampling for this collaborative bandit problem. Our method leverages prior information through efficient Gibbs sampling for posterior inference over the low-rank matrix factors, enabling faster convergence. We demonstrate significant improvements over state-of-the-art baselines on a real-world dataset from the world's largest maternal mHealth program. Our approach achieves a 16% reduction in the number of calls compared to existing methods and a 47% reduction compared to the deployed random policy. This efficiency gain translates to a potential increase in program capacity of 0.5-1.4 million beneficiaries, granting them access to vital antenatal and postnatal care information. Furthermore, we observe 7% and 29% improvements in beneficiary retention (an extremely hard metric to impact) compared to state-of-the-art and deployed baselines, respectively. Synthetic simulations further demonstrate the superiority of our approach, particularly in low-data regimes and in effectively utilizing prior information. We also provide a theoretical analysis of our algorithm in a special setting using the Eluder dimension.
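To make the idea in the abstract concrete, the following is a minimal sketch of Thompson Sampling over a low-rank reward matrix with Gibbs sampling of the factors. It is not the paper's algorithm: it assumes a Gaussian likelihood with known noise variance and zero-mean isotropic Gaussian priors on the factor rows, and it treats binary call pickups as real-valued rewards. The names `gibbs_sweep`, `run_thompson`, `noise_var`, `prior_var`, and `sweeps` are hypothetical, chosen for illustration only.

```python
import numpy as np


def gibbs_sweep(sums, counts, U, V, noise_var, prior_var, rng):
    """One Gibbs sweep: resample every row of U given V, then every row of V given U.

    sums[i, j]   = total reward observed for user i on arm j
    counts[i, j] = number of times user i was assigned arm j
    Assumed model (illustrative): reward ~ Normal(U[i] @ V[j], noise_var).
    """
    rank = U.shape[1]
    for A, B, S, C in ((U, V, sums, counts), (V, U, sums.T, counts.T)):
        for i in range(A.shape[0]):
            # Conditional posterior of row A[i] is Gaussian with this precision.
            precision = np.eye(rank) / prior_var + (B.T * C[i]) @ B / noise_var
            mean = np.linalg.solve(precision, B.T @ S[i] / noise_var)
            # Draw from N(mean, precision^{-1}) using a Cholesky factor of the precision.
            L = np.linalg.cholesky(precision)
            A[i] = mean + np.linalg.solve(L.T, rng.standard_normal(rank))


def run_thompson(true_probs, rank=2, rounds=30, sweeps=3,
                 noise_var=0.25, prior_var=1.0, seed=0):
    """Simulate Thompson Sampling: each round, every user is assigned one arm."""
    rng = np.random.default_rng(seed)
    n_users, n_arms = true_probs.shape
    U = rng.standard_normal((n_users, rank))
    V = rng.standard_normal((n_arms, rank))
    sums = np.zeros((n_users, n_arms))
    counts = np.zeros((n_users, n_arms))
    users = np.arange(n_users)
    total_reward = 0.0
    for _ in range(rounds):
        # Approximate posterior draw of the factors via a few Gibbs sweeps.
        for _ in range(sweeps):
            gibbs_sweep(sums, counts, U, V, noise_var, prior_var, rng)
        # Thompson step: act greedily with respect to the sampled reward matrix.
        arms = (U @ V.T).argmax(axis=1)
        # Simulated binary feedback (for example, whether a call was picked up).
        rewards = rng.random(n_users) < true_probs[users, arms]
        sums[users, arms] += rewards
        counts[users, arms] += 1
        total_reward += rewards.sum()
    return total_reward / (rounds * n_users), counts
```

As a usage example, `run_thompson(np.outer(np.linspace(0.2, 0.9, 20), np.linspace(0.3, 1.0, 6)))` simulates 20 users and 6 time slots with a rank-1 pickup-probability matrix and returns the average reward per assignment together with the assignment counts. Running only a few Gibbs sweeps per round, warm-started from the previous factors, is a common shortcut; it gives only an approximate posterior sample.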
