Two-step estimation of high dimensional additive models

Abstract

This paper investigates the two-step estimation of a high dimensional additive regression model in which the number of nonparametric additive components is potentially larger than the sample size but the number of significant additive components is sufficiently small. The procedure consists of two steps: the first step performs variable selection, typically by the group Lasso, and the second step applies penalized least squares estimation with Sobolev penalties to the selected additive components. Such a procedure is computationally simple to implement and, in our numerical experiments, works reasonably well. Despite its intuitive nature, the theoretical properties of this two-step procedure must be analyzed with care, since the outcome of the first-step variable selection is random: it may retain redundant additive components and, at the same time, miss significant ones. This paper derives a generic performance bound on the two-step estimation procedure that allows for these situations, and studies the overall performance in detail when the first-step variable selection is implemented by the group Lasso.
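
The following is a minimal, hedged sketch (in Python) of the kind of two-step procedure the abstract describes, run on synthetic data. It is not the authors' implementation: the polynomial basis, the proximal-gradient group Lasso, the penalty levels lam and mu, and the ridge penalty used in the second step (as a rough stand-in for a Sobolev smoothness penalty) are all illustrative assumptions.

# Two-step estimation sketch: group Lasso selection, then penalized refit.
# All tuning constants and basis choices below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 200, 400, 5                 # sample size, covariates, basis size per covariate

# Sparse additive model: only the first two covariates enter the regression.
X = rng.uniform(-1.0, 1.0, size=(n, p))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.3 * rng.standard_normal(n)

def basis(x, k):
    # Standardized polynomial basis for one covariate (illustrative choice).
    B = np.column_stack([x ** d for d in range(1, k + 1)])
    return (B - B.mean(axis=0)) / B.std(axis=0)

# Design matrix with one group of k columns per covariate.
Phi = np.hstack([basis(X[:, j], k) for j in range(p)])

# Step 1: group Lasso via proximal gradient descent (block soft-thresholding).
lam = 0.3                                            # group penalty level (assumed)
step = n / np.linalg.norm(Phi, 2) ** 2               # 1 / Lipschitz constant of the smooth part
beta = np.zeros(p * k)
for _ in range(2000):
    z = beta - step * (Phi.T @ (Phi @ beta - y) / n)  # gradient step on the least-squares loss
    for j in range(p):                                # proximal step: shrink each group norm
        g = z[j * k:(j + 1) * k]
        beta[j * k:(j + 1) * k] = max(0.0, 1.0 - step * lam / (np.linalg.norm(g) + 1e-12)) * g

selected = [j for j in range(p) if np.linalg.norm(beta[j * k:(j + 1) * k]) > 1e-6]
print("selected components:", selected)

# Step 2: penalized least squares restricted to the selected components.
# A ridge penalty on the basis coefficients stands in for the Sobolev penalty.
Phi_S = np.hstack([basis(X[:, j], k) for j in selected])
mu = 1e-2                                            # smoothness penalty level (assumed)
coef = np.linalg.solve(Phi_S.T @ Phi_S / n + mu * np.eye(Phi_S.shape[1]), Phi_S.T @ y / n)
print("in-sample R^2:", 1.0 - np.mean((y - Phi_S @ coef) ** 2) / np.var(y))

In practice one would choose lam and mu by cross-validation and use a spline basis with an explicit derivative-based Sobolev penalty in the second step; the polynomial basis and ridge penalty above are only meant to keep the sketch short and self-contained.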
