Tractable Bayesian variable selection: beyond normality

Bayesian variable selection for continuous outcomes often assumes normality. There are sound reasons behind this assumption, particularly for large p: ease of interpretation and analytical and computational convenience. More flexible frameworks exist, including semi- and non-parametric models, but often at the cost of some tractability. We propose a simple extension of the Normal model that allows for skewness and thicker-than-normal tails while preserving tractability. We show that a classical strategy for modelling asymmetric Normal and Laplace errors via two-piece distributions leads to easy interpretation and a log-concave likelihood that greatly facilitates optimization and integration. We characterize asymptotic parameter estimation and Bayes factor rates, in particular studying the effects of model misspecification via an M-estimation framework. Under suitable conditions, misspecified Bayes factors are consistent and induce sparsity at the same asymptotic rates as under the correct model. However, the rates at which signal is detected are altered by an exponential factor, often resulting in a loss of sensitivity. These deficiencies can be ameliorated by inferring the error distribution from the data, a simple step that can lead to substantial improvements in inference. Our framework operates on the likelihood and can thus be combined with any likelihood penalty or prior; here we mostly use non-local priors to induce extra sparsity. Our results highlight the practical importance of specifying the likelihood, rather than focusing solely on the prior, in Bayesian variable selection. The methodology is available as part of the R package `mombf'.
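
As a concrete illustration of the two-piece construction, the R sketch below implements one common parameterization of the two-piece Normal density, with separate scales to the left and right of the mode; this is an illustrative parameterization, not necessarily the exact one used in the paper. Each piece is a rescaled Normal, so the log-density is piecewise quadratic and concave, which is the log-concavity property that facilitates optimization and integration.

```r
# Two-piece Normal density under a common parameterization (illustrative;
# not necessarily the exact parameterization used in the paper).
# Scale sigma1 applies left of mu, sigma2 to the right; pieces match at mu.
dtwopiece <- function(y, mu = 0, sigma1 = 1, sigma2 = 2, log = FALSE) {
  s <- ifelse(y < mu, sigma1, sigma2)
  logf <- log(2) - log(sigma1 + sigma2) + dnorm((y - mu) / s, log = TRUE)
  if (log) logf else exp(logf)
}

# Right-skewed density when sigma2 > sigma1; the log-density is piecewise
# quadratic and globally concave, since both pieces have zero slope at mu.
curve(dtwopiece(x), from = -5, to = 8, ylab = "density")
```

Since the methodology is implemented in `mombf', the following sketch shows how such an analysis might be run. It assumes a recent version of the package in which `modelSelection()` accepts a `family` argument, including an option to infer the error distribution from the data; the simulated data, prior settings, and argument values are illustrative assumptions, not prescriptions from the paper.

```r
library(mombf)

set.seed(1)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
# Skewed, thick-tailed errors; only the first two covariates carry signal
e <- rlnorm(n) - exp(0.5)
y <- x[, 1] + 0.5 * x[, 2] + e

# Non-local (MOM) prior on coefficients, Beta-Binomial(1,1) prior on models.
# family = "auto" (assumed available) asks mombf to infer the error
# distribution (Normal / two-piece Normal / Laplace / two-piece Laplace).
fit <- modelSelection(y = y, x = x, family = "auto",
                      priorCoef = momprior(tau = 0.348),
                      priorDelta = modelbbprior(alpha.p = 1, beta.p = 1))

head(postProb(fit))  # posterior probabilities of the top models
```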