Feature selection in "omics" prediction problems using cat scores and false non-discovery rate control

11 March 2009
M. Ahdesmaki
K. Strimmer
Abstract

We revisit the problem of feature selection in linear discriminant analysis (LDA), i.e. when features are correlated. First, we introduce a pooled centroids formulation of the multi-class LDA predictor function, in which the relative weights of Mahalanobis-transformed predictors are given by correlation-adjusted t-scores (cat scores). Second, for feature selection we propose thresholding cat scores by controlling false non-discovery rates (FNDR). We show that, contrary to previous claims, this FNDR procedure performs very well and similarly to "higher criticism". Third, training of the classifier function is conducted by plug-in of James-Stein shrinkage estimates of correlations and variances, using analytic procedures for choosing regularization parameters. Overall, this results in an effective and computationally inexpensive framework for high-dimensional prediction with natural feature selection. The proposed shrinkage discriminant procedures are implemented in the R package "sda", available from the R repository CRAN.
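To make the notion of a cat score concrete, the following is a minimal sketch in plain Python, not the interface of the "sda" package. It assumes the usual two-class definition, in which the vector of t-scores is multiplied by the inverse square root of the pooled within-class correlation matrix P. It is restricted to two features so that P^(-1/2) has a closed form, and it uses ordinary empirical estimates rather than the James-Stein shrinkage estimates described in the abstract. The function name cat_scores_two_features and the toy data are hypothetical.

```python
import math


def cat_scores_two_features(class1, class2):
    """Return (t, cat, r) for two classes observed on two features.

    class1, class2: lists of (x, y) observations.
    t   : ordinary two-sample t-scores with pooled variances
    cat : correlation-adjusted t-scores, P^(-1/2) applied to t
    r   : pooled within-class correlation between the two features

    Assumption: empirical estimates are used here, not shrinkage estimates.
    """
    n1, n2 = len(class1), len(class2)
    dof = n1 + n2 - 2

    # Class centroids for each feature.
    m1 = [sum(row[j] for row in class1) / n1 for j in range(2)]
    m2 = [sum(row[j] for row in class2) / n2 for j in range(2)]

    # Center each observation on its own class centroid, then pool.
    centered = [[row[j] - m1[j] for j in range(2)] for row in class1]
    centered += [[row[j] - m2[j] for j in range(2)] for row in class2]

    # Pooled variances, covariance and correlation.
    v = [sum(c[j] ** 2 for c in centered) / dof for j in range(2)]
    cov = sum(c[0] * c[1] for c in centered) / dof
    r = cov / math.sqrt(v[0] * v[1])

    # Ordinary t-scores, one per feature.
    scale = 1.0 / n1 + 1.0 / n2
    t = [(m1[j] - m2[j]) / math.sqrt(scale * v[j]) for j in range(2)]

    # Closed-form inverse square root of P = [[1, r], [r, 1]].
    # P has eigenvalues 1 + r and 1 - r with eigenvectors (1, 1) and (1, -1).
    lam_plus = (1.0 + r) ** -0.5
    lam_minus = (1.0 - r) ** -0.5
    a = 0.5 * (lam_plus + lam_minus)  # diagonal entry of P^(-1/2)
    b = 0.5 * (lam_plus - lam_minus)  # off-diagonal entry of P^(-1/2)

    cat = [a * t[0] + b * t[1], b * t[0] + a * t[1]]
    return t, cat, r


if __name__ == "__main__":
    # Hypothetical toy data: four observations per class.
    class1 = [(1.0, 2.0), (2.0, 2.5), (3.0, 4.5), (4.0, 4.0)]
    class2 = [(3.0, 2.0), (4.0, 3.5), (5.0, 4.0), (6.0, 5.5)]
    t, cat, r = cat_scores_two_features(class1, class2)
    print("pooled correlation:", r)
    print("t-scores:          ", t)
    print("cat scores:        ", cat)
```

When the two features are uncorrelated (r = 0), the cat scores reduce to the ordinary t-scores. In general the squared cat scores sum to t' P^(-1) t, so they split the overall multivariate separation between the classes into per-feature contributions. For more than two features, P^(-1/2) would have to be computed numerically, for example from an eigendecomposition.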
