247

Competing Bandits: Learning under Competition

Information Technology Convergence and Services (ITCS), 2017
Abstract

Most modern systems strive to learn from interactions with users, and many engage in \emph{exploration}: making potentially suboptimal choices for the sake of acquiring new information. We initiate a study of the interplay between \emph{exploration and competition}---how such systems balance the exploration for learning and the competition for users. Here the users play three distinct roles: they are customers that generate revenue, they are sources of data for learning, and they are self-interested agents which choose among the competing systems. As a model, we consider competition between two multi-armed bandit algorithms faced with the same bandit instance. Users arrive one by one and choose among the two algorithms, so that each algorithm makes progress if and only if it is chosen. We ask whether and to which extent competition incentivizes \emph{innovation}: adoption of better algorithms. We investigate this issue for several models of user response, as we vary the degree of rationality and competitiveness in the model. Effectively, we map out the "competition vs. innovation" relationship, a well-studied theme in economics.

View on arXiv
Comments on this paper