Robust Learning with Private Information
Firms increasingly delegate decisions to learning algorithms in platform markets. Standard algorithms perform well when platform policies are stationary, but firms often face ambiguity about whether policies are stationary or adapt strategically to their behavior. When policies adapt, learning that is efficient under stationarity may backfire: it can reveal a firm's persistent private information, allowing the platform to personalize terms and extract information rents. We study a repeated screening problem in which an agent with a fixed private type commits ex ante to a learning algorithm, facing ambiguity about the principal's policy. We show that a broad class of standard algorithms, including all no-external-regret algorithms, can be manipulated by adaptive principals and permit asymptotic full surplus extraction. We then construct a misspecification-robust learning algorithm that treats stationarity as a testable hypothesis. It achieves the optimal payoff under stationarity at the minimax-optimal rate, while preventing dynamic rent extraction: against any adaptive principal, each type's long-run utility is at least its utility under the menu that maximizes revenue under the principal's prior.
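The mechanism described above, treating stationarity as a testable hypothesis and falling back to a non-revealing strategy when the test rejects, can be illustrated with a minimal sketch. This is not the paper's algorithm: the stationarity test, the tolerance `tol`, the representation of menus as numbers, and the action labels are all simplifying assumptions made purely for illustration.

```python
from statistics import mean

class RobustLearner:
    """Illustrative sketch only (not the paper's construction).

    Phase 1: play a type-independent 'safe' action while recording the
    menus the principal offers.  Phase 2: if the observed menus pass a
    crude stationarity test, exploit with the action that is optimal
    under stationarity; otherwise keep playing the non-revealing safe
    action so an adaptive principal learns nothing about the type.
    """

    def __init__(self, safe_action, exploit_action, test_rounds=50, tol=0.05):
        self.safe_action = safe_action        # non-revealing fallback
        self.exploit_action = exploit_action  # optimal if policy is stationary
        self.test_rounds = test_rounds        # length of the testing phase
        self.tol = tol                        # drift tolerance (assumed)
        self.menus = []                       # observed menus, one per round

    def observe(self, menu):
        # Record the (numerically summarized) menu offered this round.
        self.menus.append(menu)

    def _stationary(self):
        # Crude test: compare the average menu in the first and second
        # halves of the testing window; large drift rejects stationarity.
        half = self.test_rounds // 2
        first = self.menus[:half]
        second = self.menus[half:self.test_rounds]
        return abs(mean(first) - mean(second)) <= self.tol

    def act(self, t):
        if t < self.test_rounds:
            return self.safe_action  # reveal nothing while testing
        return self.exploit_action if self._stationary() else self.safe_action
```

Against a principal who posts the same menu every round, the learner switches to exploitation; against one whose menus drift in response to play, it retains the safe, non-revealing action.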