
Sparse (group) learning with Lipschitz loss functions: a unified analysis

20 October 2019
Antoine Dedieu
arXiv: 1910.08880 (abs · PDF · HTML)
Abstract

We study a family of sparse estimators defined as minimizers of an empirical Lipschitz loss function, a family which includes the hinge, logistic, and quantile regression losses, combined with a convex sparse or group-sparse regularization. In particular, we consider the L1 norm on the coefficients, its sorted Slope version, and the Group L1-L2 extension. First, we propose a theoretical framework which simultaneously derives new L2 estimation upper bounds for all three regularization schemes. For the L1 and Slope regularizations, our bounds scale as $(k^*/n)\log(p/k^*)$, where $n \times p$ is the size of the design matrix and $k^*$ is the dimension of the theoretical loss minimizer $\beta^*$; this matches the optimal minimax rate achieved in the least-squares case. For the Group L1-L2 regularization, our bounds scale as $(s^*/n)\log(G/s^*) + m^*/n$, where $G$ is the total number of groups and $m^*$ is the number of coefficients in the $s^*$ groups which contain $\beta^*$; these bounds improve over the least-squares case. We additionally show that, when the signal is strongly group-sparse, Group L1-L2 regularization is superior to L1 and Slope. Our bounds hold both in probability and in expectation, under assumptions common in the literature. Second, we propose an accelerated proximal algorithm which computes the convex estimators studied here when the number of variables is of the order of 100,000. We also compare the statistical performance of our estimators against standard baselines in settings where the signal is either sparse or group-sparse. Our experimental findings reveal (i) the good empirical performance of the L1 and Slope regularizations for sparse binary classification problems, (ii) the superiority of the Group L1-L2 regularization for group-sparse classification problems, and (iii) the appealing properties of sparse quantile regression estimators for sparse regression problems with heteroscedastic noise.
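As a rough illustration of how such estimators can be computed, the sketch below implements an accelerated proximal gradient (FISTA-style) solver for L1-regularized logistic regression, one instance of the Lipschitz losses and penalties discussed above. This is a minimal sketch under common assumptions, not the paper's implementation: the function names and the fixed step size are illustrative, and the hinge and quantile losses as well as the Slope and Group L1-L2 penalties (which only change the proximal step) are omitted.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def logistic_grad(X, y, beta):
    """Gradient of the averaged logistic loss, with labels y in {-1, +1}."""
    margins = y * (X @ beta)
    return -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)

def fista_l1_logistic(X, y, lam, n_iter=500):
    """Illustrative FISTA-style solver for L1-regularized logistic regression.

    Not the paper's algorithm: it uses a fixed step size derived from the
    Lipschitz constant of the logistic gradient and a plain soft-thresholding
    proximal step.
    """
    n, p = X.shape
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / (4.0 * n))  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    z = beta.copy()
    t = 1.0
    for _ in range(n_iter):
        # Proximal gradient step at the extrapolated point z.
        beta_next = soft_threshold(z - step * logistic_grad(X, y, z), step * lam)
        # Nesterov momentum update.
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
        z = beta_next + ((t - 1.0) / t_next) * (beta_next - beta)
        beta, t = beta_next, t_next
    return beta
```

Swapping the soft-thresholding step for the sorted-L1 (Slope) or groupwise soft-thresholding proximal operator would, in the same way, yield sketches of the other two regularized estimators considered in the abstract.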
