Generalized Principal Components for Panel Data and Factor Models

While most convergence results in the literature on high-dimensional covariance matrices concern the accuracy of estimating the covariance matrix (and precision matrix), little is known about the effect of estimating large covariances on statistical inference. We study two important models, factor analysis and the panel data model with interactive effects, and focus on statistical inference and the estimation efficiency of structural parameters based on large covariance estimators. It is known that in high-dimensional factor analysis and panel data models, the regular principal components (PC) estimator does not estimate the model parameters efficiently. This paper proposes a method of generalized principal components (GPC), which relies on a high-dimensional weight matrix. Three important weights are compared: the identity matrix, which gives the regular PC estimator; the diagonal matrix of inverse cross-sectional variances, which gives the heteroskedastic estimator; and the precision matrix of the error covariance, which gives rise to the efficient GPC. We derive the inferential theory for general GPC estimators and employ a high-dimensional inverse covariance estimator to define a feasible efficient GPC. The feasible efficient GPC is shown to be optimal over a broad class of estimators for the approximate factor model. We illustrate that most existing results on large covariance matrix estimation, which are based on absolute convergence, are too restrictive for statistical inference. Instead, we develop a new technical strategy of weighted consistency for estimating the optimal weight matrix. Finally, numerical results demonstrate that the proposed methods perform well under both heteroskedasticity and cross-sectional error correlation.
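To make the weighting idea concrete, the following is a minimal numpy sketch (not the paper's exact estimator or proofs): a generic GPC routine extracts factors from the top eigenvectors of the weighted matrix X W X', where W = I recovers the regular PC estimator and W = diag(1/sigma^2) gives a heteroskedasticity-weighted variant. The function name `gpc_estimate`, the simulated data, and the normalizations are illustrative assumptions.

```python
import numpy as np

def gpc_estimate(X, W, r):
    """Illustrative GPC: estimate r factors from the top-r eigenvectors
    of X W X' / (N*T). With W = I this is the regular PC estimator."""
    T, N = X.shape
    M = X @ W @ X.T / (N * T)
    vals, vecs = np.linalg.eigh(M)          # ascending eigenvalues
    F = np.sqrt(T) * vecs[:, -r:][:, ::-1]  # factors, T x r (normalized F'F/T = I)
    L = X.T @ F / T                         # loadings, N x r, by least squares given F
    return F, L

# Simulate a small approximate factor model with heteroskedastic errors.
rng = np.random.default_rng(0)
T, N, r = 200, 100, 2
F0 = rng.standard_normal((T, r))
L0 = rng.standard_normal((N, r))
sig = rng.uniform(0.5, 2.0, N)              # cross-sectional error std. devs.
X = F0 @ L0.T + rng.standard_normal((T, N)) * sig

F_pc, _ = gpc_estimate(X, np.eye(N), r)               # identity weight: regular PC
F_het, _ = gpc_estimate(X, np.diag(1.0 / sig**2), r)  # inverse-variance weight
```

Both estimates recover the same factor space up to rotation; with cross-sectionally correlated errors, the full precision-matrix weight (the efficient GPC of the paper) would replace the diagonal weight.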