Minimax-optimal nonparametric regression in high dimensions

Abstract

Minimax $L_2$ risks for high dimensional nonparametric regression are derived under two sparsity assumptions: 1. the true regression surface is a sparse function that depends only on $d = O(\log n)$ important predictors among a list of $p$ predictors, with $\log p = o(n)$; 2. the true regression surface depends on $O(n)$ predictors but is an additive function where each additive component is sparse but may contain two or more interacting predictors and may have a smoothness level different from other components. Broad range general results are presented to facilitate sharp lower and upper bound calculations on minimax risks in terms of modified packing entropies and covering entropies, and are specialized to spaces of additive functions. For either modeling assumption, a practicable extension of the widely used Bayesian Gaussian process regression method is shown to adaptively attain the optimal minimax rate (up to $\log n$ terms) asymptotically as both $n, p \to \infty$ with $\log p = o(n)$.
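
As a rough illustration, the two sparsity assumptions might be written as follows. The notation here is assumed for exposition and is not taken from the abstract: $f$ denotes the regression surface, $g$ and $f_s$ are generic component functions, $S$ and $S_s$ are index sets of active predictors, $k$ is the number of additive components, and $\alpha_s$ is a per-component smoothness level. The paper's own definitions may differ.

```latex
% Assumption 1 (sketch): f depends on only d of the p predictors.
\[
  f(x_1, \dots, x_p) = g(x_S), \qquad
  S \subset \{1, \dots, p\}, \quad |S| = d = O(\log n), \quad \log p = o(n).
\]

% Assumption 2 (sketch): f is additive; each component f_s depends on a
% small set S_s of possibly interacting predictors and has its own
% smoothness level \alpha_s.
\[
  f(x_1, \dots, x_p) = \sum_{s=1}^{k} f_s(x_{S_s}), \qquad
  \Bigl|\,\bigcup_{s=1}^{k} S_s\Bigr| = O(n).
\]
```

Under this reading, the first assumption is the special case of a single component ($k = 1$), while the second trades a larger total number of active predictors for additive structure across components.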
