Linear Submodular Maximization with Bandit Feedback

2 July 2024

Abstract

Submodular optimization with bandit feedback has recently been studied in a variety of contexts. In a number of real-world applications such as diversified recommender systems and data summarization, the submodular function exhibits additional linear structure. We consider developing approximation algorithms for the maximization of a submodular objective function $f:2^U\to\mathbb{R}_{\geq 0}$ , where $f=\sum_{i=1}^dw_iF_{i}$ . It is assumed that we have value oracle access to the functions $F_i$ , but the coefficients $w_i$ are unknown, and $f$ can only be accessed via noisy queries. We develop algorithms for this setting inspired by adaptive allocation algorithms in the best-arm identification for linear bandit, with approximation guarantees arbitrarily close to the setting where we have value oracle access to $f$ . Finally, we empirically demonstrate that our algorithms make vast improvements in terms of sample efficiency compared to algorithms that do not exploit the linear structure of $f$ on instances of move recommendation.

View on arXiv

Comments on this paper