Nonparametric inference about mean functionals of nonignorable nonresponse data without identifying the joint distribution

We consider identification and inference about mean functionals of observed covariates and an outcome variable subject to nonignorable missingness. By leveraging a shadow variable, we establish a necessary and sufficient condition for identification of the mean functional even if the full data distribution is not identified. We further characterize a necessary condition for $\sqrt{n}$-estimability of the mean functional. This condition naturally strengthens the identifying condition, and it requires the existence of a function as a solution to a representer equation that connects the shadow variable to the mean functional. Solutions to the representer equation may not be unique, which presents substantial challenges for nonparametric estimation, and standard theories for nonparametric sieve estimators are not applicable here. We construct a consistent estimator for the solution set and then adapt the theory of extremum estimators to find from the estimated set a consistent estimator for an appropriately chosen solution. The estimator is asymptotically normal and locally efficient, and it attains the semiparametric efficiency bound under certain regularity conditions. We illustrate the proposed approach via simulations and a real data application on home pricing.