
On the Non-Asymptotic Concentration of Heteroskedastic Wishart-type Matrix

Abstract

This paper studies the non-asymptotic concentration of heteroskedastic Wishart-type matrices. Suppose $Z$ is a $p_1$-by-$p_2$ random matrix with independent entries $Z_{ij} \sim N(0,\sigma_{ij}^2)$. We prove that the expected spectral norm of the Wishart matrix deviation, $\mathbb{E} \left\|ZZ^\top - \mathbb{E} ZZ^\top\right\|$, is upper bounded by
\begin{equation*}
(1+\epsilon)\left\{2\sigma_C\sigma_R + \sigma_C^2 + C\sigma_R\sigma_*\sqrt{\log(p_1 \wedge p_2)} + C\sigma_*^2\log(p_1 \wedge p_2)\right\},
\end{equation*}
where $\sigma_C^2 := \max_j \sum_{i=1}^{p_1}\sigma_{ij}^2$, $\sigma_R^2 := \max_i \sum_{j=1}^{p_2}\sigma_{ij}^2$, and $\sigma_*^2 := \max_{i,j}\sigma_{ij}^2$. A minimax lower bound is developed that matches this upper bound. We then derive concentration inequalities, moment bounds, and tail bounds for the heteroskedastic Wishart-type matrix under more general distributions, such as sub-Gaussian and heavy-tailed distributions. Next, we consider the cases where $Z$ has homoskedastic columns or rows (i.e., $\sigma_{ij} \approx \sigma_i$ or $\sigma_{ij} \approx \sigma_j$) and derive rate-optimal Wishart-type concentration bounds. Finally, we apply the developed tools to identify the sharp signal-to-noise ratio threshold for consistent clustering in the heteroskedastic clustering problem.
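
As a rough numerical illustration of the quantities in the bound, the following NumPy sketch computes $\sigma_C$, $\sigma_R$, and $\sigma_*$ from a given standard-deviation matrix and estimates $\mathbb{E}\left\|ZZ^\top - \mathbb{E} ZZ^\top\right\|$ by Monte Carlo. It is not taken from the paper: the function names, the trial count, and the seed are hypothetical choices. Because the abstract does not specify the constant $C$ or $\epsilon$, only the leading term $2\sigma_C\sigma_R + \sigma_C^2$ is evaluated.

```python
import numpy as np


def variance_profile_parameters(sigma):
    """Return (sigma_C, sigma_R, sigma_star) for a matrix of standard deviations.

    sigma_C^2    = max over columns j of sum_i sigma_ij^2
    sigma_R^2    = max over rows i    of sum_j sigma_ij^2
    sigma_star^2 = max over (i, j)    of sigma_ij^2
    """
    sigma2 = np.asarray(sigma, dtype=float) ** 2
    sigma_C = np.sqrt(sigma2.sum(axis=0).max())
    sigma_R = np.sqrt(sigma2.sum(axis=1).max())
    sigma_star = np.sqrt(sigma2.max())
    return sigma_C, sigma_R, sigma_star


def leading_term(sigma):
    """Leading part of the bound, 2*sigma_C*sigma_R + sigma_C^2 (log terms omitted)."""
    sigma_C, sigma_R, _ = variance_profile_parameters(sigma)
    return 2.0 * sigma_C * sigma_R + sigma_C ** 2


def mc_deviation(sigma, n_trials=200, seed=0):
    """Monte Carlo estimate of E|| Z Z^T - E[Z Z^T] || in spectral norm.

    n_trials and seed are illustrative assumptions, not values from the paper.
    """
    sigma = np.asarray(sigma, dtype=float)
    rng = np.random.default_rng(seed)
    # Entries are independent and mean zero, so E[Z Z^T] is diagonal
    # with i-th diagonal entry sum_j sigma_ij^2.
    expected = np.diag((sigma ** 2).sum(axis=1))
    norms = []
    for _ in range(n_trials):
        Z = sigma * rng.standard_normal(sigma.shape)
        # ord=2 on a 2-D array gives the largest singular value.
        norms.append(np.linalg.norm(Z @ Z.T - expected, ord=2))
    return float(np.mean(norms))
```

For instance, calling `mc_deviation(sigma)` and `leading_term(sigma)` on the same heteroskedastic `sigma` gives a simulated deviation and the leading term of the bound to compare it with. Any such comparison is heuristic, since the logarithmic terms and the factor $(1+\epsilon)$ are left out.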
