Approximating the covariance ellipsoid

Abstract

We explore ways in which the covariance ellipsoid ${\cal B}=\{v \in \mathbb{R}^d : \mathbb{E}\langle X,v\rangle^2 \leq 1\}$ of a centred random vector $X$ in $\mathbb{R}^d$ can be approximated by a simple set. The data one is given for constructing the approximating set consists of $X_1,\dots,X_N$ that are independent and distributed as $X$. We present a general method that can be used to construct such approximations and implement it for two types of approximating sets. We first construct a (random) set ${\cal K}$ defined by a union of intersections of slabs $H_{z,\alpha}=\{v \in \mathbb{R}^d : |\langle z,v\rangle| \leq \alpha\}$ (and therefore ${\cal K}$ is actually the output of a neural network with two hidden layers). The slabs are generated using $X_1,\dots,X_N$, and under minimal assumptions on $X$ (e.g., $X$ can be heavy-tailed) it suffices that $N = c_1 d\eta^{-4}\log(2/\eta)$ to ensure that $(1-\eta){\cal K} \subset {\cal B} \subset (1+\eta){\cal K}$. In some cases (e.g., if $X$ is rotation invariant and has marginals that are well behaved in some weak sense), a smaller sample size suffices: $N = c_1 d\eta^{-2}\log(2/\eta)$. We then show that if the slabs are replaced by randomly generated ellipsoids defined using $X_1,\dots,X_N$, the same degree of approximation holds when $N \geq c_2 d\eta^{-2}\log(2/\eta)$. The construction we use is based on the small-ball method.
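To make the objects above concrete, here is a minimal Python sketch of the membership tests involved: the covariance ellipsoid ${\cal B}$, a single slab $H_{z,\alpha}$, and a set ${\cal K}$ built as a union of intersections of data-generated slabs. The block structure (`m` intersections of `n` slabs each) and the slab width `alpha` are illustrative assumptions for this sketch only; the paper's actual construction via the small-ball method chooses these quantities differently.

```python
import numpy as np

rng = np.random.default_rng(0)

d, N = 5, 2000
# Independent samples distributed as the centred random vector X.
# Here X is standard Gaussian, so the true covariance ellipsoid B
# is the unit Euclidean ball (Sigma = identity).
X = rng.standard_normal((N, d))
Sigma = np.eye(d)

def in_covariance_ellipsoid(v, cov):
    """Membership in B = {v : E<X,v>^2 <= 1} = {v : v^T Sigma v <= 1}."""
    return v @ cov @ v <= 1.0

def in_slab(v, z, alpha):
    """Membership in the slab H_{z,alpha} = {v : |<z,v>| <= alpha}."""
    return abs(z @ v) <= alpha

# Hypothetical approximating set: a union of m intersections, each
# intersection taken over a block of n slabs generated from the data.
# (Illustrative parameters, not the paper's tuned construction.)
m, n = 40, 50
blocks = X[: m * n].reshape(m, n, d)
alpha = 1.0  # illustrative slab width

def in_K(v):
    """v is in K iff all n slabs of some block contain v."""
    return any(all(in_slab(v, z, alpha) for z in block) for block in blocks)

v = np.zeros(d)
print(in_covariance_ellipsoid(v, Sigma), in_K(v))  # the origin lies in both sets
```

Note that `in_K` is exactly a two-hidden-layer network with threshold activations: the first layer computes the slab indicators $|\langle z,v\rangle|\le\alpha$, the second takes an AND within each block, and the output an OR across blocks.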
