No-regret Learning in Cournot Games

In this work, we study the interaction of strategic players in continuous-action Cournot games with limited information feedback. The Cournot game is an essential model for many socio-economic systems in which players learn and compete. In many practical settings, these players do not have full knowledge of the system or of each other, so it becomes important to understand their dynamics and limiting behavior. Specifically, we assume that players follow strategies whose payoffs, in hindsight, are not exceeded by those of any single deviating action. Given this no-regret guarantee, we prove that under standard assumptions the players' joint action converges to the unique Nash equilibrium, both in the time-average sense and in the final-iteration sense. In addition, our results naturally extend the existing regret analysis of time-average convergence to obtain final-iteration convergence rates. Together, our work presents significantly sharper and more general convergence results, and shows how exploiting the game's information feedback can influence the convergence rates.
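As a rough illustration of the kind of convergence described above, the following minimal sketch simulates three players running projected online gradient ascent, one well-known no-regret algorithm for concave payoffs, in a linear Cournot game. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear inverse demand P(Q) = a - bQ, the linear costs, the exact gradient feedback, the step size eta0 / sqrt(t), and the names `nash_equilibrium` and `run_gradient_play`.

```python
import math


def nash_equilibrium(a, b, costs):
    """Closed-form interior Nash equilibrium of the linear Cournot game.

    Assumed payoff of player i: x_i * (a - b * sum(x)) - c_i * x_i.
    """
    n = len(costs)
    total_cost = sum(costs)
    return [(a - (n + 1) * c + total_cost) / (b * (n + 1)) for c in costs]


def run_gradient_play(a, b, costs, rounds=5000, eta0=0.2, x_max=10.0):
    """All players simultaneously run projected online gradient ascent.

    Returns the final joint action and the running time average.
    """
    n = len(costs)
    x = [0.0] * n    # current production quantities
    avg = [0.0] * n  # running average of the iterates
    for t in range(1, rounds + 1):
        total = sum(x)
        # Gradient of player i's payoff with respect to its own quantity.
        grads = [a - b * total - b * xi - c for xi, c in zip(x, costs)]
        eta = eta0 / math.sqrt(t)  # diminishing step size (assumed schedule)
        # Gradient step followed by projection onto [0, x_max].
        x = [min(max(xi + eta * g, 0.0), x_max) for xi, g in zip(x, grads)]
        avg = [m + (xi - m) / t for m, xi in zip(avg, x)]
    return x, avg


# Hypothetical parameters chosen so that the equilibrium is interior.
A, B, COSTS = 10.0, 1.0, [1.0, 2.0, 3.0]
last, average = run_gradient_play(A, B, COSTS)
target = nash_equilibrium(A, B, COSTS)
print("final iterate:", last)
print("time average: ", average)
print("Nash:         ", target)
```

In this toy instance both the final iterate and the time average approach the closed-form equilibrium, with the final iterate much closer after the same number of rounds. The sketch uses full gradient feedback, which is richer than the limited-information setting the abstract refers to, so it illustrates the phenomenon rather than the paper's analysis.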