
Bayesian variable selection for high dimensional generalized linear models: convergence rates of the fitted densities

Abstract

Bayesian variable selection has gained much empirical success recently in a variety of applications when the number $K$ of explanatory variables $(x_1, \dots, x_K)$ is possibly much larger than the sample size $n$. For generalized linear models, if most of the $x_j$'s have very small effects on the response $y$, we show that it is possible to use Bayesian variable selection to reduce overfitting caused by the curse of dimensionality $K \gg n$. In this approach a suitable prior can be used to choose a few out of the many $x_j$'s to model $y$, so that the posterior will propose probability densities $p$ that are ``often close'' to the true density $p^*$ in some sense. The closeness can be described by a Hellinger distance between $p$ and $p^*$ that scales at a power very close to $n^{-1/2}$, which is the ``finite-dimensional rate'' corresponding to a low-dimensional situation. These findings extend some recent work of Jiang [Technical Report 05-02 (2005) Dept. Statistics, Northwestern Univ.] on consistency of Bayesian variable selection for binary classification.
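The Hellinger distance used above to measure closeness between the fitted density $p$ and the true density $p^*$ is $H(p, p^*) = \sqrt{\tfrac{1}{2}\int (\sqrt{p} - \sqrt{p^*})^2\,dx}$. As a minimal illustration (not taken from the paper), the following sketch approximates it numerically on a grid for two normal densities, where a closed form is available for checking:

```python
import numpy as np

def hellinger(p, q, dx):
    """Grid approximation of the Hellinger distance
    H(p, q) = sqrt(0.5 * integral of (sqrt(p) - sqrt(q))^2 dx)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2) * dx)

# Illustrative example: two unit-variance normal densities, means 0 and 1.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
normal_pdf = lambda t, mu: np.exp(-0.5 * (t - mu) ** 2) / np.sqrt(2 * np.pi)

h = hellinger(normal_pdf(x, 0.0), normal_pdf(x, 1.0), dx)

# Closed form for equal-variance normals: H = sqrt(1 - exp(-(mu1 - mu2)^2 / 8)).
h_exact = np.sqrt(1.0 - np.exp(-1.0 / 8.0))
```

The grid approximation agrees with the closed form to high precision; in the paper's setting, the result is that the posterior concentrates on densities $p$ with $H(p, p^*)$ shrinking at nearly the rate $n^{-1/2}$.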
