
Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)

Abstract

We develop simple and non-asymptotically justified methods for hypothesis testing about the coefficients ($\theta^{*}\in\mathbb{R}^{p}$) in high dimensional generalized regression models where $p$ can exceed the sample size. Given a function $h:\,\mathbb{R}^{p}\mapsto\mathbb{R}^{m}$, we consider $H_{0}:\,h(\theta^{*}) = \mathbf{0}_{m}$ against $H_{1}:\,h(\theta^{*})\neq\mathbf{0}_{m}$, where $m$ can be any integer in $\left[1,\,p\right]$ and $h$ can be nonlinear in $\theta^{*}$. Our test statistic is based on the sample "quasi-score" vector evaluated at an estimate $\hat{\theta}_{\alpha}$ that satisfies $h(\hat{\theta}_{\alpha})=\mathbf{0}_{m}$, where $\alpha$ is the prespecified Type I error. Exploiting the concentration phenomenon in Lipschitz functions, the key component reflecting the dimension complexity in our non-asymptotic thresholds uses a Monte-Carlo approximation to mimic the expectation around which the Lipschitz function concentrates, and it automatically captures the dependencies between the coordinates. We provide probabilistic guarantees in terms of the Type I and Type II errors for the quasi-score test. Confidence regions are also constructed for the population quasi-score vector evaluated at $\theta^{*}$. The first set of our results is specific to the standard Gaussian linear regression models; the second set allows for reasonably flexible forms of non-Gaussian responses, heteroscedastic noise, and nonlinearity in the regression coefficients, while only requiring the correct specification of the conditional means $\mathbb{E}\left(Y_i | X_i\right)$. The novelty of our methods is that their validity does not rely on good behavior of $\left\Vert \hat{\theta}_\alpha - \theta^*\right\Vert_2$ (or even $n^{-1/2}\left\Vert X\left(\hat{\theta}_\alpha - \theta^*\right)\right\Vert_2$ in the linear regression case), either nonasymptotically or asymptotically.
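
To make the idea of a Monte-Carlo-based concentration threshold concrete, here is a minimal Python sketch. It is not the paper's procedure. It assumes a Gaussian linear model with a known noise level `sigma`, measures the score-type vector $X^{\top}\varepsilon/n$ in the sup norm, and ignores the Monte-Carlo approximation error. The function name `mc_sup_norm_threshold` and all of its parameters are hypothetical. The deviation term comes from the standard Gaussian concentration inequality for Lipschitz functions: the map $\varepsilon \mapsto \Vert X^{\top}\varepsilon/n\Vert_\infty$ is Lipschitz in the Euclidean norm with constant $\max_j \Vert X_j\Vert_2/n$, where $X_j$ is the $j$-th column of $X$.

```python
import numpy as np


def mc_sup_norm_threshold(X, sigma, alpha, n_draws=2000, seed=0):
    """Illustrative threshold r such that, for eps ~ N(0, sigma^2 I_n),
    P(||X^T eps / n||_inf >= r) <= alpha, up to Monte-Carlo error.

    Assumptions (not from the source): known sigma, sup-norm statistic.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape

    # Monte-Carlo estimate of E||X^T eps / n||_inf. Reusing the observed
    # design X means the dependence between coordinates is reflected
    # without any explicit modeling.
    eps = rng.normal(scale=sigma, size=(n_draws, n))
    mc_mean = np.abs(eps @ X / n).max(axis=1).mean()

    # Lipschitz constant of eps -> ||X^T eps / n||_inf w.r.t. the
    # Euclidean norm: the largest column norm of X divided by n.
    lipschitz = np.linalg.norm(X, axis=0).max() / n

    # Gaussian concentration: P(f >= E f + t) <= exp(-t^2 / (2 sigma^2 L^2)).
    deviation = sigma * lipschitz * np.sqrt(2.0 * np.log(1.0 / alpha))
    return mc_mean + deviation
```

Under these assumptions, a test would reject when the observed sup norm of the score-type vector exceeds the returned threshold. Because the expectation is simulated from the actual design matrix, correlated columns tend to give a smaller threshold than a union bound over the $p$ coordinates would.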
