It is important to detect a low-dimensional linear dependency in high-dimensional data. We provide a perspective on this problem through studies of the norms of possibly degenerate Gaussian vectors whose dimension is $n$ but whose correlation matrix has rank $d$. We find a precise asymptotic upper bound on such extreme values as $n \to \infty$. This upper bound is shown to be sharp when the entries of the correlation matrix are generated as inner products of i.i.d. uniformly distributed unit vectors. The upper bound also exhibits an interesting trichotomy over different ranges of $d$. Based on these results, we propose several methods for high-dimensional inference. The first application is a general hard-threshold rule for variable selection in regression. The second application is a refinement of valid post-selection inference when the size of the selected models is restricted. The third application is an inference method for low-dimensional linear dependency. One advantage of this approach is that the asymptotics are in the dimensions $n$ and $d$ but not in the sample size $m$. Thus, inference is possible even when the sample size $m$ is much smaller than the dimension $n$. Furthermore, the higher the dimension, the more accurate the inference.
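A minimal simulation sketch of the setting described above (illustrative, not the paper's code): the snippet draws $n$ i.i.d. uniform unit vectors in $\mathbb{R}^d$, uses their inner products as a rank-$d$ correlation matrix, and compares the resulting extreme value to the classical full-rank envelope $\sqrt{2 \log n}$; the values of n and d are arbitrary choices.

import numpy as np

# Sketch of the abstract's setting (assumed illustration, not the paper's code):
# an n-dimensional Gaussian vector whose correlation matrix has rank d, with
# correlations generated as inner products of i.i.d. uniform unit vectors.
rng = np.random.default_rng(0)
n, d = 10_000, 5  # dimension n and rank d are arbitrary choices

# i.i.d. uniform unit vectors in R^d: normalized standard Gaussian columns
V = rng.standard_normal((d, n))
V /= np.linalg.norm(V, axis=0)

# X = V^T Z has standard normal coordinates and correlation matrix
# R = V^T V, whose rank is at most d.
Z = rng.standard_normal(d)
X = V.T @ Z

print("extreme value max_i |X_i|      :", np.abs(X).max())
print("full-rank envelope sqrt(2log n):", np.sqrt(2 * np.log(n)))

Since every coordinate satisfies $|X_i| \le \|Z\|$, which concentrates near $\sqrt{d}$, for small $d$ the observed maximum falls well below the full-rank $\sqrt{2 \log n}$ level; this gap is the kind of degeneracy that the paper's sharper, rank-dependent bound quantifies.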