A Classification View on Meta Learning Bandits

6 April 2025
Mirco Mutti
Jeongyeol Kwon
Shie Mannor
Aviv Tamar
Abstract

Contextual multi-armed bandits are a popular choice to model sequential decision-making. For example, in a healthcare application we may perform various tests to assess a patient's condition (exploration) and then decide on the best treatment to give (exploitation). When humans design strategies, they aim for the exploration to be fast, since the patient's health is at stake, and easy to interpret for a physician overseeing the process. However, common bandit algorithms are nothing like that: the regret caused by exploration scales with $\sqrt{H}$ over $H$ rounds, and decision strategies are based on opaque statistical considerations. In this paper, we use an original classification view to meta learn interpretable and fast exploration plans for a fixed collection of bandits $\mathbb{M}$. The plan is prescribed by an interpretable decision tree that probes the payoffs of decisions to classify the test bandit. The test regret of the plan in the stochastic and contextual setting scales with $O(\lambda^{-2} C_\lambda(\mathbb{M}) \log^2(MH))$, where $M$ is the size of $\mathbb{M}$, $\lambda$ is a separation parameter over the bandits, and $C_\lambda(\mathbb{M})$ is a novel classification-coefficient that fundamentally links meta learning bandits with classification. Through a nearly matching lower bound, we show that $C_\lambda(\mathbb{M})$ inherently captures the complexity of the setting.
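To make the classification view concrete, the following is a minimal Python sketch of identifying a test bandit within a known finite collection by probing arms and eliminating inconsistent candidates. It is not the paper's algorithm: the greedy probe rule, the function names (`pick_probe_arm`, `classify_bandit`), the `pull` callback, the assumption that rewards lie in [0, 1], and the Hoeffding-style sample count with its constant are all illustrative assumptions. The sequence of probes it performs can be read as one root-to-leaf path of a decision tree over the collection.

```python
import math
from itertools import combinations


def pick_probe_arm(candidates, means, lam):
    """Return the arm whose mean payoff separates the most candidate pairs by >= lam."""
    n_arms = len(means[0])

    def separated_pairs(arm):
        return sum(
            abs(means[i][arm] - means[j][arm]) >= lam
            for i, j in combinations(candidates, 2)
        )

    return max(range(n_arms), key=separated_pairs)


def classify_bandit(means, lam, horizon, pull):
    """Probe arms until one candidate bandit remains (hypothetical sketch).

    means   : list of mean-payoff vectors, one per bandit in the collection
    lam     : assumed separation between bandits on the probed arms
    horizon : number of rounds H, used only to size the sample count
    pull    : callable arm -> observed reward, assumed to lie in [0, 1]
    """
    candidates = list(range(len(means)))
    # Illustrative Hoeffding-style sample count; the constant 8 is an assumption.
    n_samples = math.ceil(8 * math.log(2 * len(means) * horizon) / lam**2)
    pulls_used = 0
    while len(candidates) > 1:
        arm = pick_probe_arm(candidates, means, lam)
        avg = sum(pull(arm) for _ in range(n_samples)) / n_samples
        pulls_used += n_samples
        # Keep only candidates whose mean on this arm is close to the estimate.
        survivors = [m for m in candidates if abs(means[m][arm] - avg) < lam / 2]
        if not survivors or len(survivors) == len(candidates):
            break  # no further progress is possible with this probe rule
        candidates = survivors
    return candidates, pulls_used
```

Once a single candidate remains, the learner can exploit by playing that candidate's best arm for the remaining rounds. In this sketch the exploration cost depends on how many probes are needed and on $\lambda^{-2} \log(MH)$ samples per probe, which loosely mirrors the shape of the bound stated in the abstract without reproducing its analysis.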

@article{mutti2025_2504.04505,
  title={A Classification View on Meta Learning Bandits},
  author={Mirco Mutti and Jeongyeol Kwon and Shie Mannor and Aviv Tamar},
  journal={arXiv preprint arXiv:2504.04505},
  year={2025}
}