Multi-Objective Online Convex Optimization
In this paper, we broaden the horizon of online convex optimization (OCO) and consider multi-objective OCO, where there are multiple distinct loss function sequences and an algorithm has to choose its action in each time slot before the loss functions for that slot are revealed. To capture the tradeoff between tracking the different sequences, we consider the {\it min-max} regret, where the benchmark (the optimal offline algorithm) takes a static action across all time slots that minimizes the maximum, over the sequences, of the total loss (summed across time slots) incurred by each sequence. An online algorithm is allowed to change its action across time slots, and its {\it min-max} regret is defined as the difference between its {\it min-max} cost and that of the benchmark. The {\it min-max} regret is a stringent performance measure, and an algorithm with small regret needs to `track' all loss function sequences simultaneously. We first show that with adversarial input, the {\it min-max} regret scales linearly with the time horizon for any online algorithm. Consequently, we consider a stochastic i.i.d. input model where all loss functions are generated i.i.d. from an unknown joint distribution, propose a simple algorithm that combines the well-known {\it Hedge} and online gradient descent (OGD) algorithms, and bound its expected {\it min-max} regret via a remarkably simple proof. Analogous results are also derived for martingale difference and Markov input models.
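To make the setting concrete, the following is a minimal sketch of one plausible way to combine Hedge and OGD under the i.i.d. input model. It is not necessarily the algorithm analyzed in the paper. Everything specific here is an assumption made for illustration: the one-dimensional action set [-1, 1], the noisy quadratic losses, the step sizes, and the function name `run_hedge_ogd`. Hedge keeps a weight per loss sequence and shifts weight toward the sequences currently incurring more loss, while OGD takes a projected gradient step on the weighted loss. The sketch also computes the static benchmark, so the min-max regret can be read off as the difference of the two costs.

```python
import math
import random


def run_hedge_ogd(centers, T, noise=0.1, seed=0):
    """Toy run with K noisy quadratic losses f_t^k(x) = (x - y_t^k)^2 on [-1, 1].

    All modelling choices (losses, step sizes, action set) are illustrative
    assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    K = len(centers)
    eta_x = 1.0 / math.sqrt(T)          # assumed OGD step size
    eta_w = math.sqrt(math.log(K) / T)  # assumed Hedge learning rate
    x = 0.0
    log_w = [0.0] * K   # Hedge log-weights (kept in log space for stability)
    totals = [0.0] * K  # algorithm's cumulative loss on each sequence
    s1 = [0.0] * K      # running sum of y_t^k, used by the benchmark
    s2 = [0.0] * K      # running sum of (y_t^k)^2, used by the benchmark

    for _ in range(T):
        # Normalize the Hedge weights over the K loss sequences.
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        z = sum(w)
        w = [v / z for v in w]

        # The losses are drawn only after the action x has been fixed.
        ys = [c + rng.gauss(0.0, noise) for c in centers]
        losses = [(x - y) ** 2 for y in ys]
        grads = [2.0 * (x - y) for y in ys]

        for k in range(K):
            totals[k] += losses[k]
            s1[k] += ys[k]
            s2[k] += ys[k] ** 2
            log_w[k] += eta_w * losses[k]  # more weight on worse-off sequences

        # Projected OGD step on the Hedge-weighted loss.
        g = sum(w[k] * grads[k] for k in range(K))
        x = max(-1.0, min(1.0, x - eta_x * g))

    def static_cost(a):
        # Max over sequences of the total loss of the fixed action a.
        return max(T * a * a - 2.0 * a * s1[k] + s2[k] for k in range(K))

    # static_cost is convex in a, so ternary search finds the benchmark.
    lo, hi = -1.0, 1.0
    for _ in range(100):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if static_cost(m1) <= static_cost(m2):
            hi = m2
        else:
            lo = m1
    benchmark = static_cost((lo + hi) / 2)
    return max(totals), benchmark, static_cost


alg_cost, benchmark, _ = run_hedge_ogd([-0.5, 0.5], T=2000)
print("min-max regret on this sample path:", alg_cost - benchmark)
```

The sketch reports the regret on a single sample path, whereas the abstract concerns the expected regret. Averaging over several values of `seed` would give a rough empirical estimate of the latter.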