
Conditional regression for single-index models

Abstract

The single-index model is a statistical model for intrinsic regression in which the responses are assumed to depend on a single yet unknown linear combination of the predictors, allowing the regression function to be written as $\mathbb{E}[Y \mid X] = f(\langle v, X \rangle)$ for some unknown index vector $v$ and link function $f$. Estimators converging at the 1-dimensional minimax rate exist, but their implementation has exponential cost in the ambient dimension. Recent attempts at mitigating the computational cost yield estimators that are computable in polynomial time, but do not achieve the optimal rate. Conditional methods estimate the index vector $v$ by averaging moments of $X$ conditioned on $Y$, but do not provide generalization bounds on $f$. In this paper we develop an extensive non-asymptotic analysis of several conditional methods, and propose a new one that combines some benefits of the existing approaches. In particular, we establish $\sqrt{n}$-consistency for all conditional methods considered. Moreover, we prove that polynomial partitioning estimates achieve the 1-dimensional minimax rate for regression of Hölder functions when combined with any $\sqrt{n}$-consistent index estimator. Overall, this yields an estimator for dimension reduction and regression of single-index models that attains statistical and computational optimality, thereby closing the statistical-computational gap for this problem.
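The abstract itself contains no code, so the following is only a rough illustrative sketch of the conditional-moment idea it describes: estimate $v$ by averaging (first) moments of $X$ within slices of $Y$ in the style of sliced inverse regression, then fit a simple 1-dimensional partitioning regressor on the projected index $\langle \hat v, X \rangle$. The slicing scheme, the piecewise-constant regressor, and all function names below are assumptions made for illustration, not the paper's actual estimator or its new conditional method.

```python
import numpy as np

def estimate_index_sir(X, Y, n_slices=10):
    """Sketch of a conditional first-moment index estimator (SIR-style).

    Whitens X, slices Y into quantile bins, averages the whitened X
    within each slice, and returns the top eigenvector of the weighted
    covariance of slice means, mapped back to the original coordinates.
    """
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Eigendecomposition gives a stable inverse square root of Sigma.
    w, U = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = U @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ U.T
    Z = (X - mu) @ Sigma_inv_sqrt

    # Conditional moments: average the whitened predictors within slices of Y.
    n_slices = min(n_slices, len(Y))
    edges = np.quantile(Y, np.linspace(0, 1, n_slices + 1))
    ids = np.clip(np.searchsorted(edges, Y, side="right") - 1, 0, n_slices - 1)
    means, weights = [], []
    for s in range(n_slices):
        mask = ids == s
        if mask.any():
            means.append(Z[mask].mean(axis=0))
            weights.append(mask.mean())
    means, weights = np.array(means), np.array(weights)

    # Top eigenvector of the weighted covariance of slice means.
    M = (means * weights[:, None]).T @ means
    _, vecs = np.linalg.eigh(M)
    v = Sigma_inv_sqrt @ vecs[:, -1]  # map back to the original scale
    return v / np.linalg.norm(v)

def piecewise_constant_fit(t, Y, n_bins=20):
    """Simple 1-D partitioning regressor on the projected index t = <v, X>."""
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    ids = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, n_bins - 1)
    bin_means = np.array([Y[ids == b].mean() if np.any(ids == b) else Y.mean()
                          for b in range(n_bins)])
    def predict(t_new):
        j = np.clip(np.searchsorted(edges, t_new, side="right") - 1, 0, n_bins - 1)
        return bin_means[j]
    return predict

# Toy usage: Y depends on X only through the single index <v_true, X>.
rng = np.random.default_rng(0)
n, d = 2000, 10
v_true = np.zeros(d); v_true[0] = 1.0
X = rng.normal(size=(n, d))
Y = np.tanh(2.0 * (X @ v_true)) + 0.1 * rng.normal(size=n)
v_hat = estimate_index_sir(X, Y)
f_hat = piecewise_constant_fit(X @ v_hat, Y)
print("alignment |<v_hat, v_true>| =", abs(v_hat @ v_true))
```

In this toy setup the index is recovered from conditional first moments alone, which is what makes such estimators computable in polynomial time; the paper's analysis concerns the non-asymptotic $\sqrt{n}$-consistency of such estimators and the rates attained when they are combined with partitioning regression.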
