264

Learning in Structured Stackelberg Games

Main:12 Pages
6 Figures
Bibliography:3 Pages
2 Tables
Appendix:14 Pages
Abstract

We study structured Stackelberg games, in which both players (the leader and the follower) observe contextual information about the state of the world at time of play. The leader plays against one of a finite number of followers, but the follower's type is not known until after the game has ended. Importantly, we assume a fixed relationship between the contextual information and the follower's type, thereby allowing the leader to leverage this additional structure when deciding her strategy. Under this setting, we find that standard learning theoretic measures of complexity do not characterize the difficulty of the leader's learning task. Instead, we introduce a new notion of dimension, the Stackelberg-Littlestone dimension, which we show characterizes the instance-optimal regret of the leader in the online setting. Based on this, we also provide a provably optimal learning algorithm. We extend our results to the distributional setting, where we use two new notions of dimension, the γ\gamma-Stackelberg-Natarajan dimension and γ\gamma-Stackelberg-Graph dimension. We prove that these control the sample complexity lower and upper bounds respectively, and we design a simple, improper algorithm that achieves the upper bound.

View on arXiv
Comments on this paper