Global testing under the sparse alternatives for single index models

4 May 2018

Abstract

For the single index model $y=f(\beta^{\tau}x,\epsilon)$ with Gaussian design, where $f$ is unknown and $\beta$ is a sparse $p$-dimensional unit vector with at most $s$ nonzero entries, we are interested in testing the null hypothesis that $\beta$, when viewed as a whole vector, is zero against the alternative that some entries of $\beta$ are nonzero. Assuming that $var(\mathbb{E}[x \mid y])$ is non-vanishing, we define the generalized signal-to-noise ratio (gSNR) $\lambda$ of the model as the unique non-zero eigenvalue of $var(\mathbb{E}[x \mid y])$. We show that if $s^{2}\log^2(p)\wedge p$ is of smaller order than $n$, denoted as $s^{2}\log^2(p)\wedge p\prec n$, where $n$ is the sample size, one can detect the existence of signals if and only if gSNR $\succ\frac{p^{1/2}}{n}\wedge \frac{s\log(p)}{n}$. Furthermore, if the noise is additive (i.e., $y=f(\beta^{\tau}x)+\epsilon$), one can detect the existence of the signal if and only if gSNR $\succ\frac{p^{1/2}}{n}\wedge \frac{s\log(p)}{n} \wedge \frac{1}{\sqrt{n}}$. It is rather surprising that the detection boundary for the single index model with additive noise matches that for linear regression models. These results pave the way for a thorough theoretical analysis of single/multiple index models in high dimensions.
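To make the two boundaries concrete, the following minimal sketch evaluates the rates stated above for given $n$, $p$, and $s$. The function name `detection_rate` and its parameters are hypothetical and not taken from the paper. The stated results are asymptotic and hold only up to constants in the regime $s^{2}\log^2(p)\wedge p\prec n$, so the returned numbers indicate orders of magnitude rather than exact thresholds.

```python
import math


def detection_rate(n, p, s, additive_noise=False):
    """Order of the gSNR detection boundary, ignoring constants.

    n: sample size, p: dimension, s: sparsity level.
    This is an illustrative helper, not code from the paper.
    """
    # General single index model: p^{1/2}/n  ∧  s*log(p)/n
    rate = min(math.sqrt(p) / n, s * math.log(p) / n)
    if additive_noise:
        # Additive noise adds a third term to the minimum: 1/sqrt(n)
        rate = min(rate, 1.0 / math.sqrt(n))
    return rate


# Example: compare the two boundaries for one configuration.
general = detection_rate(n=10_000, p=10_000, s=5)
additive = detection_rate(n=10_000, p=10_000, s=5, additive_noise=True)
print(f"general: {general:.5f}, additive: {additive:.5f}")
```

Because the additive-noise boundary takes a minimum over one extra term, it can never exceed the general boundary for the same $n$, $p$, and $s$.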
