Logarithmic Regret and Polynomial Scaling in Online Multi-step-ahead Prediction

Main: 4 pages · 1 figure · Bibliography: 2 pages · Appendix: 5 pages
Abstract

This letter studies the problem of online multi-step-ahead prediction for unknown linear stochastic systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of future inputs, past inputs, and past outputs. Based on this characterization, we propose an online least-squares algorithm to learn the policy and analyze its regret relative to the optimal model-based predictor. We show that the online algorithm achieves logarithmic regret with respect to the optimal Kalman filter in the multi-step setting. Furthermore, with new proof techniques, we establish an almost-sure regret bound that does not rely on fixed failure probabilities for sufficiently large horizons N. Finally, our analysis also reveals that, while the regret remains logarithmic in N, its constant factor grows polynomially with the prediction horizon H, with the polynomial order set by the largest Jordan block of eigenvalue 1 in the system matrix.
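To make the setting concrete, the following is a minimal sketch (not the paper's algorithm or analysis) of an online least-squares predictor that regresses the H-step-ahead output on future inputs, past inputs, and past outputs, matching the linear parameterization described in the abstract. The system, noise levels, history length p, and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar linear stochastic system (illustrative, not from the paper):
#   x_{t+1} = a x_t + b u_t + w_t,   y_t = x_t + v_t
a, b = 0.9, 1.0
T, H, p = 2000, 3, 5  # data horizon N, prediction steps H, history length p (all assumed)

# Simulate one trajectory with i.i.d. Gaussian inputs and noise.
x = 0.0
us, ys = [], []
for t in range(T):
    u = rng.normal()
    y = x + 0.1 * rng.normal()
    us.append(u)
    ys.append(y)
    x = a * x + b * u + 0.05 * rng.normal()
us, ys = np.array(us), np.array(ys)

# Online least squares: predict y_{t+H} as a linear function of
# future inputs u_t..u_{t+H-1}, the last p inputs, and the last p outputs.
d = H + 2 * p
A = np.eye(d) * 1e-3   # regularized Gram matrix
bvec = np.zeros(d)
sq_err = 0.0
for t in range(p, T - H):
    z = np.concatenate([us[t:t + H], us[t - p:t], ys[t - p:t]])  # regressor
    theta = np.linalg.solve(A, bvec)   # current least-squares estimate
    pred = z @ theta
    sq_err += (ys[t + H] - pred) ** 2  # loss against the realized output
    A += np.outer(z, z)                # rank-one online update
    bvec += z * ys[t + H]

print(f"avg squared {H}-step prediction error: {sq_err / (T - H - p):.4f}")
```

As the abstract indicates, the quantity of interest in the paper is not this raw error but the regret, i.e., the excess prediction loss of the learned policy relative to the optimal Kalman-filter-based predictor that knows the system.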
