MIND: Inductive Mutual Information Estimation, A Convex Maximum-Entropy Copula Approach

Abstract

We propose a novel estimator of the mutual information between two ordinal vectors $x$ and $y$. Our approach is inductive (as opposed to deductive) in that it depends on the data generating distribution solely through some nonparametric properties revealing associations in the data, and does not require having enough data to fully characterize the true joint distribution $P_{x, y}$. Specifically, our approach consists of (i) noting that $I\left(y; x\right) = I\left(u_y; u_x\right)$, where $u_y$ and $u_x$ are the copula-uniform dual representations of $y$ and $x$ (i.e. their images under the probability integral transform), and (ii) estimating the copula entropies $h\left(u_y\right)$, $h\left(u_x\right)$ and $h\left(u_y, u_x\right)$ by solving a maximum-entropy problem over the space of copula densities under a constraint of the type $\alpha_m = E\left[\phi_m(u_y, u_x)\right]$. We prove that, so long as the constraint is feasible, this problem admits a unique solution, that this solution is in the exponential family, and that it can be learned by solving a convex optimization problem. The resulting estimator, which we denote MIND, is marginal-invariant, always non-negative, unbounded for any sample size $n$, consistent, has MSE rate $O(1/n)$, and is more data-efficient than competing approaches. Beyond mutual information estimation, we illustrate that our approach may be used to mitigate mode collapse in GANs by maximizing the entropy of the copula of fake samples, a model we refer to as Copula Entropy Regularized GAN (CER-GAN).
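
Step (i) above can be illustrated with a minimal sketch of an empirical probability integral transform. This is not the authors' implementation: the function name `copula_uniform` is hypothetical, and scaling ranks by $n + 1$ is one common convention assumed here for illustration. The sketch shows why any estimator built only on $u_x$ and $u_y$ is marginal-invariant: strictly increasing transformations of $x$ or $y$ leave the ranks, and hence the copula-uniform representations, unchanged.

```python
import numpy as np

def copula_uniform(z):
    """Empirical probability integral transform (hypothetical helper).

    Each column of z is replaced by its ranks scaled into (0, 1).
    The 1/(n + 1) scaling is an assumed convention, not taken from the paper.
    """
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    # Double argsort gives 0-based ranks per column (assuming no ties).
    ranks = np.argsort(np.argsort(z, axis=0), axis=0) + 1
    return ranks / (n + 1.0)

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
y = x + 0.5 * rng.normal(size=(500, 1))

# Copula-uniform representations of the raw data.
u_x, u_y = copula_uniform(x), copula_uniform(y)

# Same representations after strictly increasing marginal transformations.
u_x_t, u_y_t = copula_uniform(np.exp(x)), copula_uniform(y ** 3)

# True when the ranks are unaffected by the transformations.
same = np.array_equal(u_x, u_x_t) and np.array_equal(u_y, u_y_t)
```

Because `u_x` and `u_y` carry no information about the marginal distributions, the constraints $\alpha_m = E\left[\phi_m(u_y, u_x)\right]$ in step (ii) can be estimated from them as sample averages of whichever functions $\phi_m$ are chosen.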
