arXiv: 2009.03017 (v5, latest)

Non-exponentially weighted aggregation: regret bounds for unbounded loss functions

7 September 2020
Pierre Alquier
Abstract

We tackle the problem of online optimization with a general, possibly unbounded, loss function. It is well known that the exponentially weighted aggregation strategy (EWA) leads to a regret in √T after T steps, under the assumption that the loss is bounded. The online gradient algorithm (OGA) has a regret in √T when the loss is convex and Lipschitz. In this paper, we study a generalized aggregation strategy, where the weights no longer necessarily depend exponentially on the losses. Our strategy can be interpreted as the minimization of the expected losses plus a penalty term. When the penalty term is the Kullback-Leibler divergence, we obtain EWA as a special case, but using alternative divergences leads to regret bounds for unbounded, not necessarily convex losses. However, the cost is a worse regret bound in some cases.
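For a finite set of experts, the classical EWA strategy that the paper generalizes can be sketched as follows. This is a minimal illustration, not the paper's general measure-theoretic setting: the learning rate `eta`, the uniform prior, and the toy loss sequence are all assumptions made for the example. With a KL penalty, the exponential weights below are the closed-form minimizer of (expected loss + (1/eta) · KL(w ‖ prior)).

```python
import numpy as np

def ewa_weights(cumulative_losses, eta):
    """EWA weights over a finite set of experts.

    Each expert's weight decays exponentially with its cumulative loss.
    Subtracting the minimum before exponentiating avoids underflow and
    leaves the normalized weights unchanged.
    """
    shifted = cumulative_losses - cumulative_losses.min()
    w = np.exp(-eta * shifted)
    return w / w.sum()

# Toy run: 3 experts, bounded losses in [0, 1] (the bounded setting
# where the classical √T regret bound for EWA applies).
rng = np.random.default_rng(0)
eta = 0.5
cum = np.zeros(3)        # cumulative loss of each expert
learner_cum = 0.0        # cumulative loss of the aggregated forecaster
for t in range(100):
    w = ewa_weights(cum, eta)
    losses = rng.uniform(0.0, 1.0, size=3)
    learner_cum += w @ losses   # expected loss of the randomized prediction
    cum += losses

regret = learner_cum - cum.min()  # learner vs. best expert in hindsight
final_w = ewa_weights(cum, eta)   # weights favor the lowest-loss expert
```

Replacing the KL term with another divergence changes the weight map away from the exponential form; that substitution is what yields the paper's bounds for unbounded losses.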
