
Conditional regression for single-index models

Abstract

The single-index model is a statistical model for intrinsic regression in which the responses are assumed to depend on a single yet unknown linear combination of the predictors, allowing the regression function to be written as $\mathbb{E}[Y \mid X] = f(\langle v, X \rangle)$ for some unknown index vector $v$ and link function $f$. Estimators converging at the 1-dimensional minimax rate exist, but their implementation has exponential cost in the ambient dimension. Recent attempts at mitigating the computational cost yield estimators that are computable in polynomial time, but do not achieve the optimal rate. Conditional methods estimate the index vector $v$ by averaging moments of $X$ conditioned on $Y$, but do not provide generalization bounds on $f$. In this paper we develop an extensive non-asymptotic analysis of several conditional methods, and propose a new one that combines some benefits of the existing approaches. In particular, we establish $\sqrt{n}$-consistency for all conditional methods considered. Moreover, we prove that polynomial partitioning estimates achieve the 1-dimensional minimax rate for regression of Hölder functions when combined with any $\sqrt{n}$-consistent index estimator. Overall, this yields an estimator for dimension reduction and regression of single-index models that attains statistical and computational optimality, thereby closing the statistical-computational gap for this problem.
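The abstract itself contains no code, so the following is only a rough illustrative sketch of the conditional-moment idea it describes: estimate $v$ by averaging (first) moments of $X$ within slices of $Y$ in the style of sliced inverse regression, then fit a simple 1-dimensional partitioning regressor on the projected index $\langle \hat v, X \rangle$. The slicing scheme, the piecewise-constant regressor, and all function names below are assumptions made for illustration, not the paper's actual estimator or its new conditional method.

```python
import numpy as np

def estimate_index_sir(X, Y, n_slices=10):
    """Sketch of a conditional first-moment index estimator (SIR-style).

    Whitens X, slices Y into quantile bins, averages the whitened X
    within each slice, and returns the top eigenvector of the weighted
    covariance of slice means, mapped back to the original coordinates.
    """
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Eigendecomposition gives a stable inverse square root of Sigma.
    w, U = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = U @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ U.T
    Z = (X - mu) @ Sigma_inv_sqrt

    # Conditional moments: average the whitened predictors within slices of Y.
    n_slices = min(n_slices, len(Y))
    edges = np.quantile(Y, np.linspace(0, 1, n_slices + 1))
    ids = np.clip(np.searchsorted(edges, Y, side="right") - 1, 0, n_slices - 1)
    means, weights = [], []
    for s in range(n_slices):
        mask = ids == s
        if mask.any():
            means.append(Z[mask].mean(axis=0))
            weights.append(mask.mean())
    means, weights = np.array(means), np.array(weights)

    # Top eigenvector of the weighted covariance of slice means.
    M = (means * weights[:, None]).T @ means
    _, vecs = np.linalg.eigh(M)
    v = Sigma_inv_sqrt @ vecs[:, -1]  # map back to the original scale
    return v / np.linalg.norm(v)

def piecewise_constant_fit(t, Y, n_bins=20):
    """Simple 1-D partitioning regressor on the projected index t = <v, X>."""
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    ids = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, n_bins - 1)
    bin_means = np.array([Y[ids == b].mean() if np.any(ids == b) else Y.mean()
                          for b in range(n_bins)])
    def predict(t_new):
        j = np.clip(np.searchsorted(edges, t_new, side="right") - 1, 0, n_bins - 1)
        return bin_means[j]
    return predict

# Toy usage: Y depends on X only through the single index <v_true, X>.
rng = np.random.default_rng(0)
n, d = 2000, 10
v_true = np.zeros(d); v_true[0] = 1.0
X = rng.normal(size=(n, d))
Y = np.tanh(2.0 * (X @ v_true)) + 0.1 * rng.normal(size=n)
v_hat = estimate_index_sir(X, Y)
f_hat = piecewise_constant_fit(X @ v_hat, Y)
print("alignment |<v_hat, v_true>| =", abs(v_hat @ v_true))
```

In this toy setup the index is recovered from conditional first moments alone, which is what makes such estimators computable in polynomial time; the paper's analysis concerns the non-asymptotic $\sqrt{n}$-consistency of such estimators and the rates attained when they are combined with partitioning regression.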
