
σ-Ridge: group regularized ridge regression via empirical Bayes noise level cross-validation

Abstract

Features in predictive models are not exchangeable, yet common supervised models treat them as such. Here we study ridge regression when the analyst can partition the features into $K$ groups based on external side-information. For example, in high-throughput biology, features may represent gene expression, protein abundance, or clinical data, so that each feature group represents a distinct modality. The analyst's goal is to choose optimal regularization parameters $\lambda = (\lambda_1, \dotsc, \lambda_K)$ -- one for each group. In this work, we study the impact of $\lambda$ on the predictive risk of group-regularized ridge regression by deriving limiting risk formulae under a high-dimensional random effects model with $p \asymp n$ as $n \to \infty$. Furthermore, we propose a data-driven method for choosing $\lambda$ that attains the optimal asymptotic risk: the key idea is to interpret the residual noise variance $\sigma^2$ as a regularization parameter to be chosen through cross-validation. An empirical Bayes construction maps the one-dimensional parameter $\sigma$ to the $K$-dimensional vector of regularization parameters, i.e., $\sigma \mapsto \widehat{\lambda}(\sigma)$. Beyond its theoretical optimality, the proposed method is practical and runs as fast as cross-validated ridge regression without feature groups ($K = 1$).
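To make the construction concrete, below is a minimal Python sketch of cross-validating over the single parameter $\sigma$, not the paper's implementation. It assumes the random effects model $y = X\beta + \varepsilon$ with $\beta_j \sim N(0, \tau_k^2)$ for features $j$ in group $k$ and $\varepsilon \sim N(0, \sigma^2 I)$; the method-of-moments estimator of the group signal variances $\tau_k^2$, the Bayes penalty $\lambda_k = \sigma^2 / \tau_k^2$, and all function names are illustrative assumptions.

```python
import numpy as np

def group_ridge(X, y, lam):
    """Ridge solution with one penalty per feature:
    argmin_b ||y - X b||^2 + sum_j lam_j * b_j^2."""
    return np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)

def lambda_hat(sigma, X, y, groups, eps=1e-8):
    """Hypothetical empirical Bayes map sigma -> lambda_hat(sigma).

    Uses the moment identity
      E||X_k' y||^2 = sum_j tau_j^2 ||X_j' X_k||_F^2 + sigma^2 tr(X_k' X_k)
    to solve for the group signal variances tau_k^2, then sets the
    Bayes-optimal per-group penalty lambda_k = sigma^2 / tau_k^2.
    `groups` is a list of feature-index arrays, one per group.
    """
    K = len(groups)
    A = np.empty((K, K))
    m = np.empty(K)
    for k, gk in enumerate(groups):
        Xk = X[:, gk]
        m[k] = np.sum((Xk.T @ y) ** 2) - sigma**2 * np.trace(Xk.T @ Xk)
        for j, gj in enumerate(groups):
            A[k, j] = np.sum((X[:, gj].T @ Xk) ** 2)
    # Clip at a small floor so lambda_k stays finite and positive.
    tau2 = np.maximum(np.linalg.solve(A, m), eps)
    lam = np.empty(X.shape[1])
    for k, gk in enumerate(groups):
        lam[gk] = sigma**2 / tau2[k]
    return lam

def sigma_ridge_cv(X, y, groups, sigma_grid, n_folds=5, seed=0):
    """Cross-validate over sigma alone; each candidate sigma is mapped
    to K penalties via lambda_hat before fitting on the training folds."""
    n = X.shape[0]
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    cv_err = []
    for sigma in sigma_grid:
        err = 0.0
        for held_out in folds:
            train = np.setdiff1d(np.arange(n), held_out)
            lam = lambda_hat(sigma, X[train], y[train], groups)
            b = group_ridge(X[train], y[train], lam)
            err += np.sum((y[held_out] - X[held_out] @ b) ** 2)
        cv_err.append(err / n)
    best = sigma_grid[int(np.argmin(cv_err))]
    return group_ridge(X, y, lambda_hat(best, X, y, groups)), best
```

Note the cost structure this illustrates: the cross-validation loop is one-dimensional in $\sigma$ regardless of $K$, which is why the method can run as fast as ordinary cross-validated ridge regression.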
