Least Absolute Gradient Selector: Statistical Regression via Pseudo-Hard Thresholding

11 April 2012
Kun Yang
Abstract

Variable selection in linear models plays a pivotal role in modern statistics. Hard-thresholding methods such as $\ell_0$ regularization are theoretically ideal but computationally infeasible. In this paper, we propose a new approach, called LAGS, short for "least absolute gradient selector", to this challenging yet interesting problem by mimicking the discrete selection process of $\ell_0$ regularization. To estimate $\beta$ under the influence of noise, we consider, nevertheless, the following convex program
\[
\hat{\beta} = \textrm{arg min}\ \frac{1}{n}\|X^{T}(y - X\beta)\|_1 + \lambda_n\sum_{i = 1}^{p} w_i(y; X; n)\,|\beta_i|,
\]
where $\lambda_n > 0$ controls the sparsity and the weights $w_i > 0$ on the different $\beta_i$ depend on $y$, $X$ and $n$; $n$ is the sample size. Surprisingly, we show in the paper, both geometrically and analytically, that LAGS enjoys two attractive properties: (1) with strategically chosen $w_i$, LAGS exhibits the discrete selection behavior and hard-thresholding property of $\ell_0$ regularization; we call this property "pseudo-hard thresholding". (2) Asymptotically, LAGS is consistent and capable of discovering the true model; nonasymptotically, LAGS identifies the sparsity of the model, and the prediction error of the coefficients is bounded at the noise level up to a logarithmic factor $\log p$, where $p$ is the number of predictors. Computationally, LAGS can be solved efficiently by convex programming routines owing to its convexity, or by the simplex algorithm after recasting it as a linear program. Numerical simulations show that LAGS is superior to soft-thresholding methods in terms of mean squared error and parsimony of the model.
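Since the LAGS objective is convex, one way to experiment with it is through a general-purpose convex solver. The following is a minimal sketch using CVXPY on simulated data; the sample size, the regularization level `lam`, and the uniform weights `w` are illustrative assumptions, not the weight choices analyzed in the paper.

```python
# Minimal sketch: solving the LAGS objective with CVXPY (assumed setup, not the
# paper's implementation). The weights w_i and lambda_n are placeholders.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]          # sparse ground truth
y = X @ beta_true + 0.1 * rng.standard_normal(n)

lam = 0.5                                 # lambda_n: controls sparsity (assumed value)
w = np.ones(p)                            # w_i(y; X; n): placeholder uniform weights

beta = cp.Variable(p)
objective = cp.Minimize(
    cp.norm1(X.T @ (y - X @ beta)) / n                # (1/n) * ||X^T (y - X beta)||_1
    + lam * cp.sum(cp.multiply(w, cp.abs(beta)))      # weighted l1 penalty
)
cp.Problem(objective).solve()
print(np.round(beta.value, 3))            # nonzero entries indicate selected variables
```

Because both terms are piecewise linear in $\beta$, the same problem could also be rewritten as a linear program, as the abstract notes, and handed to a simplex-based solver.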
