Fisher Information and Mutual Information Constraints
We consider the processing of statistical samples X ∼ P_θ by a channel p(y|x), and characterize how the statistical information from the samples for estimating the parameter θ can scale with the mutual information or capacity of the channel. We show that if the statistical model has a sub-Gaussian score function, then the trace of the Fisher information matrix for estimating θ from Y can scale at most linearly with the mutual information between X and Y. We apply this result to obtain minimax lower bounds in distributed statistical estimation problems, and obtain a tight preconstant for Gaussian mean estimation. We then show how our Fisher information bound can also imply distributed strong data processing inequalities based on mutual information or Jensen-Shannon divergence.
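As a rough illustration of the linear scaling described above, the following sketch works through one toy case that is not taken from the paper: a scalar Gaussian sample X ∼ N(θ, 1) passed through an additive Gaussian noise channel Y = X + Z with Z ∼ N(0, noise_var). The function names and the choice of channel are assumptions made for this example. In this case both quantities have closed forms, and the elementary inequality t/(1+t) ≤ log(1+t) gives Fisher information ≤ 2·I(X;Y) in nats. The constant 2 is specific to this toy calculation and is not claimed to be the constant in the paper's theorem.

```python
import math


def fisher_info_awgn(noise_var: float) -> float:
    """Fisher information about theta carried by Y = X + Z.

    Assumes X ~ N(theta, 1) and Z ~ N(0, noise_var), so Y ~ N(theta, 1 + noise_var).
    """
    return 1.0 / (1.0 + noise_var)


def mutual_info_awgn(noise_var: float) -> float:
    """Mutual information I(X;Y) in nats for the same toy Gaussian channel."""
    return 0.5 * math.log(1.0 + 1.0 / noise_var)


if __name__ == "__main__":
    # Sweep from a nearly noiseless channel to a very noisy one.
    for noise_var in [0.01, 0.1, 1.0, 10.0, 100.0]:
        fi = fisher_info_awgn(noise_var)
        mi = mutual_info_awgn(noise_var)
        # The ratio fi / (2 * mi) stays at or below 1 in this toy example.
        print(f"noise_var={noise_var:>6}: Fisher={fi:.5f}  "
              f"I(X;Y)={mi:.5f}  Fisher/(2*I)={fi / (2 * mi):.4f}")
```

In this example the ratio approaches 1 as the channel becomes noisier, so the linear relationship is essentially tight when the mutual information is small, and becomes loose when the channel is nearly noiseless.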