
Sample Efficient Toeplitz Covariance Estimation

ACM-SIAM Symposium on Discrete Algorithms (SODA), 2019
Abstract

We study the sample complexity of estimating the covariance matrix $T$ of a distribution $\mathcal{D}$ over $d$-dimensional vectors, under the assumption that $T$ is Toeplitz. This assumption arises in many signal processing problems, where the covariance between any two measurements only depends on the time or distance between those measurements. We are interested in estimation strategies that may choose to view only a subset of entries in each vector sample $x \sim \mathcal{D}$, which often equates to reducing hardware and communication requirements in applications ranging from wireless signal processing to advanced imaging. Our goal is to minimize both 1) the number of vector samples drawn from $\mathcal{D}$ and 2) the number of entries accessed in each sample. We provide some of the first non-asymptotic bounds on these sample complexity measures that exploit $T$'s Toeplitz structure, and by doing so, significantly improve on results for generic covariance matrices. Our bounds follow from a novel analysis of classical and widely used estimation algorithms (along with some new variants), including methods based on selecting entries from each vector sample according to a so-called sparse ruler. In many cases, we pair our upper bounds with matching or nearly matching lower bounds. In addition to results that hold for any Toeplitz $T$, we further study the important setting when $T$ is close to low-rank, which is often the case in practice. We show that methods based on sparse rulers perform even better in this setting, with sample complexity scaling sublinearly in $d$. Motivated by this finding, we develop a new covariance estimation strategy that further improves on all existing methods in the low-rank case: when $T$ is rank-$k$ or nearly rank-$k$, it achieves sample complexity depending polynomially on $k$ and only logarithmically on $d$.
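To make the sparse ruler idea concrete, the following is a minimal illustrative sketch, not the paper's exact algorithm or analysis. It assumes zero-mean samples and uses one simple ruler construction of size roughly $2\sqrt{d}$ (the first $\lceil\sqrt{d}\rceil$ indices plus indices stepping back from $d-1$ in strides of $\lceil\sqrt{d}\rceil$). The function names `sparse_ruler` and `estimate_toeplitz` are hypothetical and chosen here for illustration only.

```python
import math
import numpy as np


def sparse_ruler(d):
    """Return a sorted index subset R of {0, ..., d-1} such that every
    distance s in {0, ..., d-1} equals a - b for some a, b in R.

    Assumed construction (one of several possible): the first m indices
    together with d-1, d-1-m, d-1-2m, ..., where m = ceil(sqrt(d)).
    """
    m = math.isqrt(d - 1) + 1          # equals ceil(sqrt(d)) for d >= 1
    low = set(range(min(m, d)))        # 0, 1, ..., m-1
    high = set(range(d - 1, -1, -m))   # d-1, d-1-m, ... down to >= 0
    return sorted(low | high)


def estimate_toeplitz(observed, ruler, d):
    """Estimate a d x d Toeplitz covariance from ruler entries only.

    observed: array of shape (n, len(ruler)); column i holds entry
              ruler[i] of each of the n samples (assumed zero-mean).
    """
    # For each distance s, remember one pair of ruler positions (i, j)
    # whose underlying indices differ by exactly s.
    pair_for = {}
    for i, a in enumerate(ruler):
        for j, b in enumerate(ruler):
            if a >= b:
                pair_for.setdefault(a - b, (i, j))
    if set(pair_for) != set(range(d)):
        raise ValueError("ruler does not cover every distance 0..d-1")

    # Average the product of the two entries across samples to estimate
    # the covariance at each distance s.
    t = np.array([
        np.mean(observed[:, pair_for[s][0]] * observed[:, pair_for[s][1]])
        for s in range(d)
    ])

    # Assemble the Toeplitz matrix: entry (i, j) depends only on |i - j|.
    idx = np.arange(d)
    return t[np.abs(idx[:, None] - idx[None, :])]
```

For example, with `d = 16` this construction reads only the 7 entries at indices `[0, 1, 2, 3, 7, 11, 15]` from each sample, and `estimate_toeplitz(X[:, ruler], ruler, d)` returns a symmetric Toeplitz estimate. The sketch uses a single index pair per distance for simplicity; averaging over all available pairs is another option.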
