
M-estimation with the Trimmed $\ell_1$ Penalty

Abstract

We study high-dimensional M-estimators with the trimmed $\ell_1$ penalty. While the standard $\ell_1$ penalty incurs bias (shrinkage), the trimmed $\ell_1$ penalty leaves the $h$ largest entries penalty-free. This family of estimators includes the Trimmed Lasso for sparse linear regression and its counterpart for sparse graphical model estimation. The trimmed $\ell_1$ penalty is non-convex, but unlike other non-convex regularizers such as SCAD and MCP, it is not amenable, and therefore prior analyses cannot be applied. We characterize the support recovery of the estimates as a function of the trimming parameter $h$. Under certain conditions, we show that for any local optimum, (i) if the trimming parameter $h$ is smaller than the true support size, all zero entries of the true parameter vector are successfully estimated as zero, and (ii) if $h$ is larger than the true support size, the non-relevant parameters of the local optimum have smaller absolute values than the relevant parameters, and hence the relevant parameters are not penalized. We then bound the $\ell_2$ error of any local optimum. These bounds are asymptotically comparable to those for non-convex amenable penalties such as SCAD or MCP, but enjoy better constants. We specialize our main results to linear regression and graphical model estimation. Finally, we develop a fast, provably convergent optimization algorithm for the trimmed regularizer problem. The algorithm has the same rate of convergence as difference of convex (DC)-based approaches, but is faster in practice and finds better objective values than recently proposed algorithms for DC optimization. Empirical results further demonstrate the value of $\ell_1$ trimming.
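
To make the penalty described above concrete, the following is a minimal sketch of how the trimmed $\ell_1$ penalty of a parameter vector could be evaluated. It assumes that "largest" means largest in absolute value, so that the penalty is the sum of all absolute values except the $h$ largest. The function name `trimmed_l1` is hypothetical, and the regularization weight that would multiply the penalty in a full estimator is omitted; this is an illustration, not the authors' implementation.

```python
def trimmed_l1(theta, h):
    """Sum of |theta_j| over all entries except the h largest in magnitude.

    Assumption for illustration: 'largest' is measured by absolute value,
    and h is a non-negative integer trimming parameter.
    """
    if h < 0:
        raise ValueError("h must be non-negative")
    # Sort magnitudes in ascending order so the h largest sit at the end.
    magnitudes = sorted(abs(x) for x in theta)
    # Keep only the smallest len(theta) - h magnitudes; these are penalized.
    n_penalized = max(len(magnitudes) - h, 0)
    return sum(magnitudes[:n_penalized])


theta = [3.0, -0.5, 0.25, -4.0]
print(trimmed_l1(theta, 0))  # h = 0 recovers the ordinary l1 norm
print(trimmed_l1(theta, 2))  # the two largest entries (3.0 and -4.0) are free
```

With $h = 0$ the sketch reduces to the ordinary $\ell_1$ norm, and once $h$ reaches the length of the vector no entry is penalized at all.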
