M-estimation with the Trimmed ℓ1 Penalty
We study high-dimensional M-estimators with the trimmed ℓ1 penalty. While the standard ℓ1 penalty incurs bias (shrinkage), the trimmed ℓ1 penalty leaves the h largest entries penalty-free. This family of estimators includes the Trimmed Lasso for sparse linear regression and its counterpart for sparse graphical model estimation. The trimmed ℓ1 penalty is non-convex, but unlike other non-convex regularizers such as SCAD and MCP, it is not amenable, so prior analyses cannot be applied. We characterize the support recovery of the estimates as a function of the trimming parameter h. Under certain conditions, we show that for any local optimum, (i) if the trimming parameter h is smaller than the true support size, all zero entries of the true parameter vector are successfully estimated as zero, and (ii) if h is larger than the true support size, the non-relevant parameters of the local optimum have smaller absolute values than the relevant parameters, and hence the relevant parameters are not penalized. We then bound the error of any local optimum. These bounds are asymptotically comparable to those for non-convex amenable penalties such as SCAD or MCP, but enjoy better constants. We specialize our main results to linear regression and graphical model estimation. Finally, we develop a fast, provably convergent optimization algorithm for the trimmed-regularizer problem. The algorithm has the same rate of convergence as difference-of-convex (DC) based approaches, but is faster in practice and finds better objective values than recently proposed algorithms for DC optimization. Empirical results further demonstrate the value of trimming.
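For concreteness, the trimming described above corresponds to the following penalty on a parameter vector $\theta \in \mathbb{R}^p$ (a sketch of the standard formulation; the symbol $R_h$ and the order-statistic notation are our own labels, not necessarily the paper's):

$$ R_h(\theta) = \sum_{j=h+1}^{p} \lvert \theta_{(j)} \rvert, \qquad \lvert \theta_{(1)} \rvert \ge \lvert \theta_{(2)} \rvert \ge \cdots \ge \lvert \theta_{(p)} \rvert, $$

so the h largest-magnitude entries escape shrinkage entirely, while the remaining p − h entries are penalized exactly as under the ℓ1 norm.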
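The connection to DC optimization mentioned at the end of the abstract can be made explicit. Under the definition above, the penalty admits the decomposition

$$ R_h(\theta) = \lVert \theta \rVert_1 \;-\; \max_{\lvert S \rvert = h} \sum_{j \in S} \lvert \theta_j \rvert, $$

a difference of two convex functions, since the sum of the h largest absolute entries is a pointwise maximum of convex functions. (This decomposition is a standard observation included here for context; the abstract itself does not state it.)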
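A minimal runnable sketch of this penalty and of the resulting Trimmed Lasso objective for linear regression, under the definition above; the function names and the (lam, h) parameterization are illustrative assumptions, not the paper's notation or API:

    import numpy as np

    def trimmed_l1(theta, h):
        # Sum of |theta_j| over all but the h largest-magnitude entries;
        # the h largest entries are left penalty-free, as described above.
        mags = np.sort(np.abs(theta))          # magnitudes, ascending
        return mags[: max(theta.size - h, 0)].sum()

    def trimmed_lasso_objective(beta, X, y, lam, h):
        # Least-squares loss plus the trimmed l1 penalty (illustrative form).
        resid = y - X @ beta
        return 0.5 / y.size * resid @ resid + lam * trimmed_l1(beta, h)

    # The two largest entries (5.0 and -3.0) escape the penalty:
    beta = np.array([5.0, -3.0, 0.1, 0.0, -0.2])
    print(trimmed_l1(beta, h=2))               # 0.1 + 0.0 + 0.2 = 0.3

Setting h = 0 recovers the standard ℓ1 penalty, while h ≥ p removes the penalty altogether, which is one way to see why the choice of h relative to the true support size drives the support-recovery behavior described above.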