
Long-Context Linear System Identification

International Conference on Learning Representations (ICLR), 2024
Main: 9 pages · 4 figures · Bibliography: 5 pages · Appendix: 20 pages
Abstract

This paper addresses the problem of long-context linear system identification, where the state x_t of a dynamical system at time t depends linearly on previous states x_s over a fixed context window of length p. We establish a sample complexity bound that matches the i.i.d. parametric rate up to logarithmic factors for a broad class of systems, extending previous works that considered only first-order dependencies. Our findings reveal a learning-without-mixing phenomenon, indicating that learning long-context linear autoregressive models is not hindered by slow mixing properties potentially associated with extended context windows. Additionally, we extend these results to (i) shared low-rank representations, where rank-regularized estimators improve the dependence of the rates on the dimensionality, and (ii) misspecified context lengths in strictly stable systems, where shorter contexts offer statistical advantages.
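As a concrete illustration of the setting, the following is a minimal sketch (not the paper's method) of estimating a long-context linear autoregressive model x_t = Σ_{s=1}^{p} A_s x_{t-s} + noise via ordinary least squares on the stacked context window. All dimensions, scales, and variable names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vector AR(p) system: x_t = sum_{s=1}^p A_s x_{t-s} + w_t
# (small random coefficients chosen so the system is strictly stable)
d, p, T = 3, 4, 2000
A_true = [rng.normal(scale=0.05, size=(d, d)) for _ in range(p)]

# Simulate one trajectory driven by Gaussian noise
X = np.zeros((T, d))
for t in range(p, T):
    X[t] = sum(A_true[s] @ X[t - 1 - s] for s in range(p)) + 0.1 * rng.normal(size=d)

# Least-squares estimator: regress x_t on the stacked context
# z_t = [x_{t-1}; ...; x_{t-p}] of dimension p*d
Y = X[p:]                                                     # targets, shape (T-p, d)
Z = np.hstack([X[p - 1 - s : T - 1 - s] for s in range(p)])   # contexts, shape (T-p, p*d)
A_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)                 # shape (p*d, d)

# Unstack into the p estimated coefficient matrices
A_hat_blocks = [A_hat[s * d : (s + 1) * d].T for s in range(p)]
err = max(np.linalg.norm(A_hat_blocks[s] - A_true[s]) for s in range(p))
```

With a single trajectory of length T, the estimation error shrinks at roughly the parametric rate in the number of unknowns p·d², consistent with the sample complexity bound described in the abstract.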
