Estimating Graph Dimension with Cross-validated Eigenvalues
In applied multivariate statistics, estimating the number of latent dimensions or the number of clusters, k, is a fundamental and recurring problem. We study a sequence of statistics called "cross-validated eigenvalues." Under a large class of random graph models, including both Poisson and Bernoulli edges, without parametric assumptions, we provide a p-value for each cross-validated eigenvalue. It tests the null hypothesis that the sample eigenvector is orthogonal to (i.e., uncorrelated with) the true latent dimensions. This approach naturally adapts to problems where some dimensions are not statistically detectable. In scenarios where all dimensions can be estimated, we show that our procedure consistently estimates k. In simulations and a data example, the proposed estimator compares favorably to alternative approaches in both computational and statistical performance.
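To make the idea concrete, the following is a minimal sketch of one way a cross-validated eigenvalue and its p-value could be computed. The abstract does not spell out the procedure, so every detail here is an assumption made for illustration: the edge-splitting step (each edge goes to a held-out test graph with probability `eps`), the restriction to the algebraically largest training eigenvalues, the Poisson-style plug-in variance `2 (x²)ᵀ A_test (x²)`, the one-sided normal approximation, and the function names `cross_validated_eigenvalues` and `estimate_dimension`. It should not be read as the authors' exact statistic.

```python
import math
import numpy as np


def cross_validated_eigenvalues(A, k_max, eps=0.1, seed=0):
    """Return (test eigenvalue, z-score, p-value) for the leading k_max
    training eigenvectors. Illustrative sketch, not the paper's exact method.

    A is assumed to be a symmetric matrix of non-negative integer edge
    counts; self-loops are ignored.
    """
    rng = np.random.default_rng(seed)
    upper = np.triu(np.asarray(A), k=1).astype(np.int64)

    # Edge splitting (assumed): each edge is held out with probability eps.
    test_upper = rng.binomial(upper, eps)
    train_upper = upper - test_upper
    A_test = test_upper + test_upper.T
    A_train = train_upper + train_upper.T

    # Eigenvectors of the training graph; eigh returns ascending eigenvalues,
    # so reverse to put the algebraically largest first.
    _, vecs = np.linalg.eigh(A_train.astype(float))
    leading = vecs[:, ::-1][:, :k_max]

    results = []
    for j in range(leading.shape[1]):
        x = leading[:, j]
        # Quadratic form on the held-out edges, rescaled to the full graph.
        lam_test = (x @ A_test @ x) / eps
        # Plug-in variance assuming independent Poisson-like edge counts.
        x2 = x ** 2
        var = 2.0 * (x2 @ A_test @ x2) / eps ** 2
        z = lam_test / math.sqrt(var) if var > 0 else 0.0
        # One-sided upper-tail p-value from the standard normal.
        p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
        results.append((lam_test, z, p_value))
    return results


def estimate_dimension(results, alpha=0.05):
    """Count leading eigenvectors whose p-value is below alpha,
    stopping at the first one that is not significant."""
    k_hat = 0
    for _, _, p_value in results:
        if p_value >= alpha:
            break
        k_hat += 1
    return k_hat
```

Under these assumptions, `estimate_dimension(cross_validated_eigenvalues(A, k_max=10))` returns an estimate of k for a symmetric adjacency matrix `A`. Because the eigenvectors come from the training edges and the quadratic form uses only the held-out edges, an eigenvector that merely fits noise should produce a test eigenvalue near zero, which is what the null hypothesis in the abstract describes.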