Deconvolution estimation of sampling probabilities, and the Horvitz-Thompson estimator
We elaborate on a deconvolution method for estimating the empirical distribution of unknown parameters, recently suggested by Efron (2013). The method is applied to estimating the empirical distribution of the 'sampling probabilities' of m sampled items. The estimated empirical distribution is then used to modify the Horvitz-Thompson estimator. The performance of the modified Horvitz-Thompson estimator is studied in two examples. In the first, the sampling probabilities are estimated from the number of visits until a response was obtained. The second concerns estimating the number of unseen species; we analyze the familiar problem, and related data, of 'how many words did Shakespeare know?'. In this second example, the sampling probabilities are estimated from the number of times the various species/words were observed.
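For orientation, the following is a minimal sketch of the classical (unmodified) Horvitz-Thompson estimator of a population total, which weights each sampled value by the inverse of its sampling probability. It does not implement the deconvolution step or the modification studied here. The function name `horvitz_thompson_total` and the example numbers are hypothetical, and the sampling probabilities are simply assumed to be given.

```python
def horvitz_thompson_total(values, probs):
    """Classical Horvitz-Thompson estimate of a population total.

    values: observed quantities y_i for the m sampled items.
    probs:  sampling probabilities p_i of those items (assumed known
            here; in practice they would have to be estimated).
    """
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in probs):
        raise ValueError("each sampling probability must lie in (0, 1]")
    # Each sampled item stands in for 1 / p_i items of the population.
    return sum(y / p for y, p in zip(values, probs))


# Hypothetical illustration: three sampled items.
y = [10.0, 20.0, 30.0]
p = [0.5, 0.25, 0.5]
total_hat = horvitz_thompson_total(y, p)

# Setting every y_i to 1 turns the same formula into an estimate of
# the population size (seen plus unseen items).
size_hat = horvitz_thompson_total([1.0] * len(p), p)
```

Setting all values to 1 gives an estimate of the population size, which is the form relevant to the unseen-species problem: subtracting the number of observed items from that estimate yields an estimate of the number unseen.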