v1v2 (latest)

Near-Linear MIR Algorithms for Stochastically-Ordered Priors

Algorithmic Game Theory (AGT), 2025

18 February 2025

Main:15 Pages

4 Figures

Bibliography:2 Pages

Appendix:24 Pages

Abstract

With the rise of online applications, recommender systems (RSs) often encounter constraints in balancing exploration and exploitation. Such constraints arise when exploration is carried out by agents whose utility must be taken into account when optimizing overall welfare. A recent work by Bahar et al. (2020) suggests that recommendations should be \emph{mechanism-informed individually rational} (MIR). Specifically, if agents have a default arm they would use, relying on the RS should yield each agent at least the reward of the default arm, conditioned on the information available to the RS. Under the MIR constraint, striking a balance between exploration and exploitation becomes a complex planning problem. To that end, Bahar et al. propose an approximately optimal yet inefficient planning algorithm that runs in $O(2^K K^2 H^2)$ , where $K$ is the number of arms and $H$ is the size of the support of the reward distributions. In this paper, we make a significant improvement for a special yet practical case, removing both the dependence on $H$ and the exponential dependence on $K$ . We assume a stochastic order of the rewards (e.g., Gaussian with unit variance, Bernoulli, etc.), and devise an asymptotically optimal algorithm with a runtime of $O(K \log K)$ . Our technique is based on formulating a Goal Markov Decision Process (GMDP), establishing an optimal dynamic programming procedure, and then unveiling its crux -- fleshing out a simple index-based structure that facilitates efficient computation. Additionally, we present an incentive-compatible version of our algorithm.

View on arXiv

Comments on this paper