
Multi-Objective \textit{Min-Max} Online Convex Optimization

Comments: 11 pages (main), 5 figures, 4 pages bibliography, 31 pages appendix
Abstract

In this paper, we broaden the horizon of online convex optimization (OCO) and consider multi-objective OCO, where there are $K$ distinct loss function sequences, and an algorithm has to choose its action at time $t$ before the $K$ loss functions at time $t$ are revealed. To capture the tradeoff between tracking the $K$ different sequences, we consider the {\it min-max} regret, where the benchmark (optimal offline algorithm) takes a static action across all time slots that minimizes the maximum of the total loss (summed across time slots) incurred by each of the $K$ sequences. An online algorithm is allowed to change its action across time slots, and its {\it min-max} regret is defined as the difference between its {\it min-max} cost and that of the benchmark. The {\it min-max} regret is a stringent performance measure, and an algorithm with small regret needs to `track' all $K$ loss functions simultaneously. We first show that with adversarial input, the {\it min-max} regret scales linearly with the time horizon $T$ for any online algorithm. Consequently, we consider a stochastic i.i.d. input model, where all loss functions are generated i.i.d. from an unknown joint distribution, and propose a simple algorithm that combines the well-known {\it Hedge} and online gradient descent (OGD) algorithms; we show via a remarkably simple proof that its expected {\it min-max} regret is $O(\sqrt{T \log(TK)})$. Analogous results are also derived for martingale difference and Markov input models.
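The Hedge-plus-OGD combination described in the abstract can be illustrated with a minimal sketch: Hedge plays the role of the max player, reweighting the $K$ objectives toward whichever currently has the largest cumulative loss, while OGD plays the min player, descending the Hedge-weighted loss. The code below is an illustrative toy in one dimension with quadratic losses; the function name `hedge_ogd`, the parameter `eta_hedge`, and the step-size schedule are assumptions for this sketch, not the paper's actual algorithm or tuning.

```python
import math

def hedge_ogd(grad_fns, loss_fns, T, eta_hedge, x0=0.0, radius=1.0):
    """Toy sketch of a Hedge + OGD combination for min-max OCO.

    loss_fns[k](x) -> loss of objective k at action x
    grad_fns[k](x) -> gradient of objective k at x
    Hedge (max player) upweights high-loss objectives; OGD (min player)
    descends the Hedge-weighted loss. Names and schedules are illustrative.
    """
    K = len(loss_fns)
    log_w = [0.0] * K  # log-weights for numerical stability
    x, x_sum = x0, 0.0
    for t in range(1, T + 1):
        # Hedge distribution over the K objectives.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        s = sum(w)
        p = [wi / s for wi in w]
        # OGD step on the weighted loss, with a decaying step size.
        g = sum(p[k] * grad_fns[k](x) for k in range(K))
        x -= (radius / math.sqrt(t)) * g
        x = max(-radius, min(radius, x))  # project onto [-radius, radius]
        # Hedge update: exponentially upweight objectives with large loss.
        for k in range(K):
            log_w[k] += eta_hedge * loss_fns[k](x)
        x_sum += x
    return x_sum / T  # averaged iterate

# Example: two quadratics (x - 0)^2 and (x - 1)^2, whose min-max
# optimal static action is x = 0.5; the averaged iterate approaches it.
centers = [0.0, 1.0]
loss_fns = [lambda x, c=c: (x - c) ** 2 for c in centers]
grad_fns = [lambda x, c=c: 2.0 * (x - c) for c in centers]
x_bar = hedge_ogd(grad_fns, loss_fns, T=5000, eta_hedge=0.1)
```

In this toy instance the equilibrium is the point where both losses are equal, which is why Hedge's reweighting stabilizes the iterate near the min-max optimum rather than the minimizer of either single objective.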
