
A Shrinkage Principle for Heavy-Tailed Data: High-Dimensional Robust Low-Rank Matrix Recovery

Abstract

This paper introduces a simple principle for robust high-dimensional statistical inference via appropriate shrinkage of the data. This widens the scope of high-dimensional techniques, reducing the moment conditions from sub-exponential or sub-Gaussian distributions to merely bounded second or fourth moments. As an illustration of this principle, we focus on robust estimation of the low-rank matrix $\Theta^*$ from the trace regression model $Y = \mathrm{Tr}(\Theta^{*T} X) + \epsilon$. This model encompasses four popular problems: sparse linear models, compressed sensing, matrix completion, and multi-task regression. We propose applying the penalized least-squares approach to appropriately truncated or shrunk data. Under only a bounded $2+\delta$ moment condition on the response, the proposed robust methodology yields an estimator with the same statistical error rates as the previous literature assumes sub-Gaussian errors. For sparse linear models and multi-task regression, we further allow the design to have only bounded fourth moments and obtain the same statistical rates, again by appropriate shrinkage of the design matrix. As a byproduct, we give a robust covariance matrix estimator and establish its concentration inequality in terms of the spectral norm when the random samples have only bounded fourth moments. Extensive simulations have been carried out to support our theories.
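As a rough illustration of the shrinkage principle described above (a minimal sketch, not the paper's exact estimator or tuning), one can truncate each entry of heavy-tailed samples at a threshold $\tau$ and then form the usual sample covariance from the shrunk data; the threshold choice below is purely illustrative:

```python
import numpy as np

def truncate(x, tau):
    """Element-wise shrinkage: clip each entry of x to [-tau, tau]."""
    return np.sign(x) * np.minimum(np.abs(x), tau)

rng = np.random.default_rng(0)
n, p = 500, 10
# Heavy-tailed design: Student-t with 4.5 degrees of freedom has a
# bounded fourth moment but is far from sub-Gaussian.
X = rng.standard_t(df=4.5, size=(n, p))

# Illustrative truncation level; the appropriate rate depends on n, p,
# and the moment bounds (see the paper's theory for the exact choice).
tau = (n / np.log(p)) ** 0.25
X_shrunk = truncate(X, tau)

# Robust covariance estimate built from the shrunk samples.
Sigma_hat = X_shrunk.T @ X_shrunk / n
```

The same `truncate` operator can be applied to the responses before running any off-the-shelf penalized least-squares solver, which is the essence of the plug-in robustification the abstract describes.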
