Signed Support Recovery for Single Index Models in High-Dimensions

Abstract

In this paper we study the support recovery problem for single index models $Y=f(\boldsymbol{X}^{\intercal}\boldsymbol{\beta},\varepsilon)$, where $f$ is an unknown link function, $\boldsymbol{X}\sim N_p(0,\mathbb{I}_{p})$, and $\boldsymbol{\beta}$ is an $s$-sparse unit vector such that $\boldsymbol{\beta}_{i}\in \{\pm\frac{1}{\sqrt{s}},0\}$. In particular, we look into the performance of two computationally inexpensive algorithms: (a) the diagonal thresholding sliced inverse regression (DT-SIR) introduced by Lin et al. (2015); and (b) a semi-definite programming (SDP) approach inspired by Amini & Wainwright (2008). When $s=O(p^{1-\delta})$ for some $\delta>0$, we demonstrate that both procedures can succeed in recovering the support of $\boldsymbol{\beta}$ as long as the rescaled sample size $\kappa=\frac{n}{s\log(p-s)}$ is larger than a certain critical threshold. On the other hand, when $\kappa$ is smaller than a critical value, any algorithm fails to recover the support with probability at least $\frac{1}{2}$ asymptotically. In other words, we demonstrate that both DT-SIR and the SDP approach are optimal (up to a scalar) for recovering the support of $\boldsymbol{\beta}$ in terms of sample size.
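
To make the setting concrete, the following is a minimal Python sketch of the diagonal-thresholding idea behind DT-SIR, not the authors' implementation. It rests on several assumptions that are not in the abstract: the link function is taken to be a hypothetical linear one, the sparsity s is treated as known so that the s largest diagonal entries are kept instead of applying a data-driven threshold, and the function names (`rescaled_sample_size`, `simulate`, `sliced_variances`, `dt_support`) and the number of slices `H` are illustrative choices. The sketch recovers only the unsigned support; recovering signs would need a further step that is not shown.

```python
import numpy as np


def rescaled_sample_size(n, s, p):
    """kappa = n / (s * log(p - s)), as defined in the abstract."""
    return n / (s * np.log(p - s))


def simulate(n, p, s, rng):
    """Draw (X, Y) from a single index model with a HYPOTHETICAL linear link.

    beta has entries +-1/sqrt(s) on its first s coordinates and 0 elsewhere.
    """
    beta = np.zeros(p)
    beta[:s] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    X = rng.standard_normal((n, p))
    eps = rng.standard_normal(n)
    Y = 2.0 * (X @ beta) + 0.5 * eps  # assumed link f(t, e) = 2t + 0.5e
    return X, Y, beta


def sliced_variances(X, Y, H):
    """Estimate the diagonal of Cov(E[X | Y]) by slicing the sorted responses."""
    order = np.argsort(Y)
    slices = np.array_split(order, H)
    slice_means = np.stack([X[idx].mean(axis=0) for idx in slices])  # H x p
    weights = np.array([len(idx) for idx in slices]) / len(Y)
    return weights @ slice_means**2


def dt_support(X, Y, H, s):
    """Keep the s coordinates with the largest estimated variances.

    Simplification: a top-s rule stands in for a threshold on the diagonal.
    """
    v = sliced_variances(X, Y, H)
    return np.sort(np.argsort(v)[-s:])


rng = np.random.default_rng(0)
n, p, s, H = 2000, 50, 5, 10
X, Y, beta = simulate(n, p, s, rng)
estimated_support = dt_support(X, Y, H, s)
true_support = np.flatnonzero(beta)
kappa = rescaled_sample_size(n, s, p)
```

In this toy configuration the rescaled sample size is large, so the estimated and true supports would be expected to agree; shrinking `n` while holding `s` and `p` fixed lowers `kappa` and should make recovery progressively less reliable, which is the regime the abstract's lower bound addresses.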
