
Global testing under the sparse alternatives for single index models

Abstract

For the single index model $y=f(\beta^{\tau}x,\epsilon)$ with Gaussian design, where $f$ is unknown and $\beta$ is a sparse $p$-dimensional unit vector with at most $s$ nonzero entries, we are interested in testing the null hypothesis that $\beta$, when viewed as a whole vector, is zero against the alternative that some entries of $\beta$ are nonzero. Assuming that $var(\mathbb{E}[x \mid y])$ is non-vanishing, we define the generalized signal-to-noise ratio (gSNR) $\lambda$ of the model as the unique non-zero eigenvalue of $var(\mathbb{E}[x \mid y])$. We show that if $s^{2}\log^2(p)\wedge p$ is of a smaller order than $n$, denoted $s^{2}\log^2(p)\wedge p\prec n$, where $n$ is the sample size, one can detect the existence of signals if and only if gSNR $\succ\frac{p^{1/2}}{n}\wedge \frac{s\log(p)}{n}$. Furthermore, if the noise is additive (i.e., $y=f(\beta^{\tau}x)+\epsilon$), one can detect the existence of the signal if and only if gSNR $\succ\frac{p^{1/2}}{n}\wedge \frac{s\log(p)}{n} \wedge \frac{1}{\sqrt{n}}$. It is rather surprising that the detection boundary for the single index model with additive noise matches that for linear regression models. These results pave the way for a thorough theoretical analysis of single/multiple index models in high dimensions.
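
As a rough numerical illustration of the two boundaries stated above, the following sketch evaluates the rate expressions for given $n$, $p$ and $s$. The function names `detection_rate` and `regime_ratio` are hypothetical and not from the paper. The relations $\prec$ and $\succ$ are asymptotic and hold only up to constants, so the numbers indicate orders of magnitude, not exact thresholds.

```python
import math


def detection_rate(n, p, s, additive_noise=False):
    """Order of the gSNR detection boundary from the abstract (constants ignored).

    General noise:  p^{1/2}/n  ^  s*log(p)/n
    Additive noise: the same, additionally ^ 1/sqrt(n)
    where ^ denotes the minimum.
    """
    rate = min(math.sqrt(p) / n, s * math.log(p) / n)
    if additive_noise:
        rate = min(rate, 1.0 / math.sqrt(n))
    return rate


def regime_ratio(n, p, s):
    """Ratio (s^2 log^2(p) ^ p) / n.

    The abstract's first result assumes this quantity is of smaller order
    than 1; a small value is only a heuristic indication of that regime.
    """
    return min(s ** 2 * math.log(p) ** 2, p) / n


if __name__ == "__main__":
    # Illustrative values only; they are assumptions, not from the paper.
    n, p, s = 5000, 10000, 5
    print("regime ratio:", regime_ratio(n, p, s))
    print("boundary (general noise):", detection_rate(n, p, s))
    print("boundary (additive noise):", detection_rate(n, p, s, additive_noise=True))
```

A gSNR well above the printed rate would, under the stated assumptions, fall on the detectable side of the boundary; the comparison is only meaningful asymptotically.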
