
Functional L-Optimality Subsampling for Massive Data

Abstract

Massive data pose serious memory and computation challenges, which can be alleviated to some extent by using subsamples drawn from the full data as a surrogate. Functional data are commonly measured densely over their domains, so they require even more memory and computation time when the sample size is large. The burden grows further when statistical inference relies on bootstrap samples. To the best of our knowledge, no existing work systematically studies subsampling for functional linear regression or its generalizations. In this article, we propose an optimal subsampling method for the functional linear model based on the functional L-optimality criterion. When the response is a discrete or categorical variable, we further extend this subsampling method to the functional generalized linear model. We establish the asymptotic properties of the estimators obtained from the subsampling methods. The finite-sample performance of the proposed subsampling methods is investigated through extensive simulation studies. We also apply the proposed methods to global climate data and kidney transplant data. The results of these analyses show that the optimal subsampling methods motivated by the functional L-optimality criterion perform much better than uniform subsampling and approximate the full-data results well.
