Fast Simultaneous Feature Selection and Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2013
Abstract

Many learning problems require selecting a given number of features from a larger pool and training a classifier on the selected features. Penalized methods based on sparsity-inducing priors can be slow on large datasets with millions of observations and features. Moreover, sparsity priors can bias the estimated parameters, so a second step is often needed to learn an unbiased model. In this paper we propose a novel, efficient algorithm that simultaneously selects a desired number of variables and learns a model on them by optimizing a likelihood under a sparsity constraint. The iterative algorithm alternates parameter updates with tightening the sparsity constraint, gradually removing variables according to a criterion and a schedule. We present a generic approach for optimizing any differentiable loss function, and an application to logistic regression with parametric and non-parametric formulations and consistency guarantees. Experiments on real and simulated data show that the proposed method outperforms state-of-the-art methods based on sparsity-inducing penalties in both variable selection and prediction, while being computationally faster.
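The abstract describes an alternation between parameter updates and a gradual tightening of the sparsity constraint. The following is a minimal sketch of that idea, not the paper's actual algorithm: it assumes a logistic loss on labels in {0, 1}, a simple linear removal schedule, and coefficient magnitude |w_j| as the removal criterion, all of which are illustrative choices; the paper's schedule and criterion may differ.

```python
import numpy as np

def select_and_learn(X, y, k, n_iter=100, lr=0.1):
    """Illustrative sketch: alternate gradient steps on a logistic loss
    with gradually removing the weakest features until k remain."""
    n, p = X.shape
    w = np.zeros(p)
    active = np.arange(p)  # indices of currently kept features
    for t in range(n_iter):
        # Parameter update: gradient step on the logistic loss,
        # restricted to the active features.
        z = X[:, active] @ w[active]
        grad = X[:, active].T @ (1.0 / (1.0 + np.exp(-z)) - y) / n
        w[active] -= lr * grad
        # Schedule (assumed, linear): shrink the support size from p to k.
        m = max(k, int(p - (p - k) * (t + 1) / n_iter))
        if m < active.size:
            # Criterion (assumed): keep the m coefficients largest in magnitude.
            keep = np.argsort(-np.abs(w[active]))[:m]
            dropped = np.setdiff1d(active, active[keep])
            w[dropped] = 0.0
            active = active[keep]
    return w, active
```

Because variables are removed only gradually, each surviving coefficient keeps being refit by plain gradient steps, which is one way the approach can avoid the shrinkage bias of sparsity penalties without a separate debiasing step.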
