Adapting to Unknown Noise Distribution in Matrix Denoising

Abstract

We consider the problem of estimating an unknown matrix $\boldsymbol{X}\in {\mathbb R}^{m\times n}$ from observations $\boldsymbol{Y} = \boldsymbol{X}+\boldsymbol{W}$, where $\boldsymbol{W}$ is a noise matrix with independent and identically distributed entries, so as to minimize the estimation error measured in operator norm. Assuming that the underlying signal $\boldsymbol{X}$ is low-rank and incoherent with respect to the canonical basis, we prove that the minimax risk is equivalent to $(\sqrt{m}\vee\sqrt{n})/\sqrt{I_W}$ in the high-dimensional limit $m,n\to\infty$, where $I_W$ is the Fisher information of the noise. Crucially, we develop an efficient procedure that achieves this risk, adaptively over the noise distribution (under certain regularity assumptions). Letting $\boldsymbol{X} = \boldsymbol{U}{\boldsymbol{\Sigma}}\boldsymbol{V}^{{\sf T}}$, where $\boldsymbol{U}\in {\mathbb R}^{m\times r}$ and $\boldsymbol{V}\in{\mathbb R}^{n\times r}$ are orthogonal and $r$ is kept fixed as $m,n\to\infty$, we use our method to estimate $\boldsymbol{U}$ and $\boldsymbol{V}$. Standard spectral methods provide non-trivial estimates of the factors $\boldsymbol{U},\boldsymbol{V}$ (weak recovery) only if the singular values of $\boldsymbol{X}$ are larger than $(mn)^{1/4}{\rm Var}(W_{11})^{1/2}$. We prove that the new approach achieves weak recovery down to the information-theoretically optimal threshold $(mn)^{1/4}I_W^{-1/2}$.
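
To make the comparison between the two thresholds concrete, the following sketch recalls the standard definition of the Fisher information of a location family and the classical bound relating it to the variance. It assumes that $W_{11}$ has mean zero and an absolutely continuous density, written here as $p_W$ (a symbol not used in the abstract). The Laplace example is purely illustrative, and whether it satisfies the paper's regularity assumptions is not checked here.

```latex
% Fisher information of the noise (location family), assuming a density p_W:
\[
  I_W \;=\; \int_{\mathbb R} \frac{\bigl(p_W'(w)\bigr)^2}{p_W(w)}\,{\rm d}w .
\]
% Classical bound (Cramer-Rao for the location parameter), with equality
% if and only if W_{11} is Gaussian:
\[
  I_W \;\ge\; \frac{1}{{\rm Var}(W_{11})}
  \qquad\Longrightarrow\qquad
  (mn)^{1/4}\, I_W^{-1/2} \;\le\; (mn)^{1/4}\,{\rm Var}(W_{11})^{1/2}.
\]
% Illustrative example: Laplace noise with scale b,
% p_W(w) = (2b)^{-1} e^{-|w|/b}, has score -sign(w)/b, so
\[
  I_W = \frac{1}{b^2}, \qquad {\rm Var}(W_{11}) = 2b^2, \qquad
  \frac{I_W^{-1/2}}{{\rm Var}(W_{11})^{1/2}} = \frac{1}{\sqrt{2}} .
\]
```

Under these assumptions, a threshold scaling as $I_W^{-1/2}$ is never larger than the spectral threshold, and the two coincide only for Gaussian noise. Both quantities also scale linearly when the noise is rescaled by a constant, as a signal-strength threshold should.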
