High-Dimensional Learning under Approximate Sparsity with Applications to
Nonsmooth Estimation and Regularized Neural Networks
High-dimensional statistical learning (HDSL) has wide applications in data analysis, operations research, and decision-making. Despite the availability of multiple theoretical frameworks, most existing HDSL schemes stipulate two conditions: (a) sparsity and (b) restricted strong convexity (RSC). This paper generalizes both conditions via the folded concave penalty (FCP). More specifically, we consider an M-estimation problem in which (i) conventional sparsity is relaxed to approximate sparsity and (ii) RSC is completely absent. We show that FCP-based regularization yields poly-logarithmic sample complexity: the required training data size grows only poly-logarithmically in the problem dimensionality. This finding facilitates the analysis of two important classes of models that are currently less well understood: high-dimensional nonsmooth learning and (deep) neural networks (NNs). For both problems, we show that the poly-logarithmic sample complexity is maintained. In particular, our results indicate that the generalizability of NNs under over-parameterization can be theoretically ensured with the aid of regularization.
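As a rough illustration of the setup (a sketch based on the standard folded-concave-penalty literature, not necessarily the paper's exact formulation; the loss \(\ell\), samples \(z_i\), and constant \(a\) below are illustrative symbols), an FCP-regularized M-estimator takes the form
\[
\hat{\theta} \in \arg\min_{\theta \in \mathbb{R}^p} \; \frac{1}{n}\sum_{i=1}^{n} \ell(\theta; z_i) \;+\; \sum_{j=1}^{p} P_\lambda(|\theta_j|),
\]
where \(\ell\) is a (possibly nonsmooth) loss and \(P_\lambda\) is a folded concave penalty, e.g., the MCP \(P_\lambda(t) = \lambda \int_0^t (1 - s/(a\lambda))_+ \, ds\) with \(a > 1\). Under approximate sparsity, the true parameter need only be well approximated by a sparse vector rather than being exactly sparse.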