Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models

We consider a high-dimensional mean estimation problem over a binary hidden Markov model, which illuminates the interplay between memory in data, sample size, dimension, and signal strength in statistical inference. In this model, an estimator observes n samples of a d-dimensional parameter vector θ*, multiplied by a random sign S_i (1 ≤ i ≤ n), and corrupted by isotropic standard Gaussian noise. The sequence of signs {S_i} ∈ {-1, 1}^n is drawn from a stationary homogeneous Markov chain with flip probability δ ∈ [0, 1/2]. As δ varies, this model smoothly interpolates between two well-studied models: the Gaussian Location Model, for which δ = 0, and the Gaussian Mixture Model, for which δ = 1/2. Assuming that the estimator knows δ, we establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of ‖θ*‖, δ, d, n. We then provide an upper bound for the case of estimating θ* assuming a (possibly inaccurate) knowledge of δ. The bound is proved to be tight when δ is an accurately known constant. These results are then combined into an algorithm which estimates θ* with δ unknown a priori, and theoretical guarantees on its error are stated.
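The observation model described above can be sketched with a minimal simulation. The function name `sample_model` and all parameter values below are illustrative, not from the paper: each observation is Y_i = S_i θ* + Z_i, where Z_i is isotropic standard Gaussian noise and the signs {S_i} follow a stationary Markov chain on {-1, +1} with flip probability δ.

```python
import numpy as np

def sample_model(theta, n, delta, rng):
    """Simulate n observations Y_i = S_i * theta + Z_i from the
    binary Markov Gaussian mixture model (illustrative sketch)."""
    d = theta.shape[0]
    signs = np.empty(n)
    # Stationary distribution of the symmetric chain is uniform on {-1, +1}.
    signs[0] = rng.choice([-1.0, 1.0])
    flips = rng.random(n - 1) < delta  # flip the sign with probability delta
    for i in range(1, n):
        signs[i] = -signs[i - 1] if flips[i - 1] else signs[i - 1]
    noise = rng.standard_normal((n, d))  # Z_i ~ N(0, I_d)
    return signs[:, None] * theta + noise, signs

rng = np.random.default_rng(0)
theta = np.ones(5) / np.sqrt(5)  # unit-norm parameter vector, d = 5
Y, S = sample_model(theta, n=1000, delta=0.1, rng=rng)
print(Y.shape)  # (1000, 5)
```

Note that δ = 0 makes all signs identical (the Gaussian Location Model), while δ = 1/2 makes them i.i.d. uniform (the Gaussian Mixture Model), matching the interpolation described in the abstract.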