
High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification

Abstract

We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a high-dimensional asymptotic regime where $p, n \to \infty$ and $p/n \to \gamma \in (0, \infty)$. Our approach allows for dense random effects and for arbitrary covariance $\Sigma$ among the features. For both methods, we show that the predictive risk stabilizes asymptotically to values for which we give exact expressions. We find that the limiting risk depends in an explicit and efficiently computable way on the spectrum of the feature-covariance matrix, the signal strength, and the aspect ratio $\gamma$. In extensive simulations we show that the analytic results are accurate even for moderate problem sizes. Our results reveal some surprising aspects of the behavior of these methods in high dimensions. For example, in regularized discriminant analysis, the angle between the estimated and oracle separating hyperplanes converges to a deterministic positive value that does not in general go to zero even in the strong-signal limit. Our results build on recent advances in random matrix theory, and suggest that this approach can yield useful qualitative insights into the behavior of predictive methods.
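The stabilization of the predictive risk claimed in the abstract can be checked empirically. The sketch below is a minimal Monte Carlo illustration (not the paper's code) under simplifying assumptions: an isotropic design $\Sigma = I$, dense random effects $\beta_j \sim N(0, \alpha^2/p)$, and a fixed ridge penalty. For $\Sigma = I$, the excess predictive risk of ridge regression reduces to $\|\hat\beta - \beta\|^2$. Running the same experiment at two problem sizes with the same aspect ratio $\gamma = p/n$ should give nearly identical risk estimates:

```python
import numpy as np

def ridge_risk(n, p, alpha2=1.0, sigma2=1.0, lam=1.0, reps=20, seed=0):
    """Monte Carlo estimate of the ridge predictive risk under an
    isotropic design (Sigma = I) with dense random effects:
    beta_j ~ N(0, alpha2/p), x_ij ~ N(0, 1), noise ~ N(0, sigma2).
    For Sigma = I the excess predictive risk equals ||beta_hat - beta||^2.
    (Illustrative sketch; alpha2, sigma2, lam are assumed parameters,
    not values taken from the paper.)"""
    rng = np.random.default_rng(seed)
    risks = []
    for _ in range(reps):
        beta = rng.normal(0.0, np.sqrt(alpha2 / p), size=p)
        X = rng.normal(size=(n, p))
        y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), size=n)
        # ridge estimator with penalty scaled by n: (X'X + n*lam*I)^{-1} X'y
        beta_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)
        risks.append(np.sum((beta_hat - beta) ** 2))
    return float(np.mean(risks))

# Same aspect ratio gamma = p/n = 2 at two problem sizes:
r_small = ridge_risk(n=200, p=400, seed=1)
r_large = ridge_risk(n=800, p=1600, seed=2)
print(r_small, r_large)  # the two estimates should be close
```

The point of the comparison is that the risk depends (asymptotically) only on $\gamma$, the signal strength, and the spectrum of $\Sigma$, not on $n$ and $p$ separately.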
