Online Linear Regression in Dynamic Environments via Discounting

Abstract

We develop algorithms for online linear regression which achieve optimal static and dynamic regret guarantees \emph{even in the complete absence of prior knowledge}. We present a novel analysis showing that a discounted variant of the Vovk-Azoury-Warmuth forecaster achieves dynamic regret of the form $R_{T}(\vec{u})\le O\left(d\log(T)\vee \sqrt{dP_{T}^{\gamma}(\vec{u})T}\right)$, where $P_{T}^{\gamma}(\vec{u})$ is a measure of variability of the comparator sequence, and show that the discount factor achieving this result can be learned on-the-fly. We show that this result is optimal by providing a matching lower bound. We also extend our results to \emph{strongly-adaptive} guarantees which hold over every sub-interval $[a,b]\subseteq[1,T]$ simultaneously.
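For intuition, here is a minimal sketch of what a discounted Vovk-Azoury-Warmuth update might look like: discounted ridge-regression statistics, with the current covariance term included before predicting (the VAW modification). This is an illustrative reconstruction, not the paper's exact algorithm; the precise update, the regularization, and the scheme for learning the discount factor on-the-fly are given in the paper, and the parameters `gamma` and `lam` below are assumed for the sketch.

```python
import numpy as np

def discounted_vaw(X, y, gamma=0.99, lam=1.0):
    """Sketch of a discounted Vovk-Azoury-Warmuth forecaster.

    Predicts y_hat_t = <x_t, w_t>, where w_t solves a discounted ridge
    problem whose covariance already includes the current point x_t.
    gamma = 1 recovers the standard (undiscounted) VAW forecaster.
    """
    T, d = X.shape
    S = np.zeros((d, d))   # discounted sum of x_s x_s^T
    b = np.zeros(d)        # discounted sum of y_s x_s
    preds = np.empty(T)
    for t in range(T):
        x = X[t]
        S = gamma * S + np.outer(x, x)   # include x_t before predicting
        b = gamma * b
        preds[t] = x @ np.linalg.solve(lam * np.eye(d) + S, b)
        b += y[t] * x                    # y_t is revealed after predicting
    return preds

# Toy usage: a comparator that drifts halfway through the stream,
# the setting where discounting should help over gamma = 1.
rng = np.random.default_rng(0)
T, d = 500, 3
X = rng.normal(size=(T, d))
w_true = np.where(np.arange(T)[:, None] < T // 2, 1.0, -1.0) * np.ones(d)
y = np.einsum("td,td->t", X, w_true) + 0.1 * rng.normal(size=T)
preds = discounted_vaw(X, y, gamma=0.95)
print(np.mean((preds - y) ** 2))
```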
