Influence of Recommender Systems on Users: A Dynamical Systems Analysis
We analyze the unintended effects that recommender systems have on the very user preferences they are trying to learn. We consider a contextual multi-armed bandit recommendation algorithm that learns optimal product recommendations from user and product attributes. It is well known that the sequence of recommendations affects user preferences. However, typical learning algorithms treat user attributes as static and disregard the impact of their recommendations on user preferences. We analyze the effect of this mismatch between the modeling assumption of a static environment and the reality of an environment that evolves in response to the recommendations. To this end, we introduce a model for the coupled evolution of a linear bandit recommender system and its users, whose preferences are drawn towards the recommendations made by the algorithm. We describe a method, grounded in stochastic approximation theory, for deriving a dynamical system model that asymptotically approximates the mean behavior of the stochastic model. The resulting dynamical system captures the coupled evolution of the population preferences and the learning algorithm, and analyzing it gives insight into the long-term properties of both. Under certain conditions, we show that the recommender system is able to learn the population preferences in spite of the model mismatch. We characterize the relation between the model parameters and the long-term preferences of users. A key observation is that the exploration-exploitation tradeoff used by the recommendation algorithm significantly affects the long-term preferences of users. Algorithms that exploit more can polarize user preferences, leading to the well-known filter bubble phenomenon.
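The coupled evolution described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's model: it assumes two hypothetical products with orthogonal attribute vectors, an epsilon-greedy rule as the exploration-exploitation knob, a stochastic-gradient (LMS) update standing in for the bandit's estimator, and a user drift step `gamma` that pulls the served user's preference vector towards the recommended product. All names and parameter values (`simulate`, `epsilon`, `gamma`, `eta`) are illustrative assumptions.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def simulate(epsilon, steps=5000, n_users=50, gamma=0.05, eta=0.05, seed=0):
    """Toy coupled dynamics of a linear bandit and drifting users.

    Assumptions (not taken from the paper): epsilon-greedy exploration,
    an LMS update for the learner, and a convex-combination drift of the
    served user's preferences towards the recommended product.
    Returns (users, theta).
    """
    rng = random.Random(seed)
    # Two hypothetical products with orthogonal unit attribute vectors.
    arms = [(1.0, 0.0), (0.0, 1.0)]
    # Initial user preferences: uniform points in the unit square.
    users = [[rng.random(), rng.random()] for _ in range(n_users)]
    # The learner's estimate of a single, static preference vector.
    theta = [0.0, 0.0]

    for _ in range(steps):
        i = rng.randrange(n_users)  # a random user arrives
        if rng.random() < epsilon:
            a = rng.randrange(len(arms))  # explore
        else:
            scores = [dot(theta, x) for x in arms]
            a = max(range(len(arms)), key=scores.__getitem__)  # exploit
        x = arms[a]

        # Noisy linear reward generated by the user's current preference.
        reward = dot(users[i], x) + rng.gauss(0.0, 0.1)

        # Learner update: treats the environment as static (the mismatch).
        err = reward - dot(theta, x)
        theta = [t + eta * err * xj for t, xj in zip(theta, x)]

        # User update: preference is drawn towards the recommendation.
        users[i] = [(1 - gamma) * u + gamma * xj for u, xj in zip(users[i], x)]

    return users, theta
```

One way to use the sketch is to run it for several values of `epsilon` and compare the resulting spread of `users` along each coordinate, for example `simulate(0.0)` against `simulate(0.5)`. Any pattern seen in such a toy run is only suggestive; the paper's conclusions rest on its own model and the associated dynamical system analysis.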