Mixture Decompositions using a Decomposition of the Sample Space

Abstract

We present a scheme for decomposing joint probability distributions of $N$ binary random variables as mixtures of other distributions: $p(x_1,\ldots,x_N)=\sum_{i=1}^m \alpha_i f_i(x_1,\ldots,x_N)$, where $\alpha_i \geq 0$, $\sum_{i=1}^m \alpha_i = 1$, and the $f_i$ belong to some exponential family. We characterize subsets of the sample space for which any distribution with support therein can be used as a mixture component $f_i$ from an exponential family. This allows us to derive bounds for the minimal number $m$ of mixture components from a hierarchy of exponential families that is sufficient to represent any distribution, and bounds for the number of mixture components necessary to represent distributions with arbitrary correlations up to a given order. We show in particular that every distribution $p$ on $\{0,1\}^N$ can be written as a mixture of $m$ independent distributions whenever $m \geq 2^{N-1}$, and furthermore, that there are distributions which cannot be written as a mixture of fewer than $2^{N-1}$ independent distributions. We also find that a number $\sim 2^{N-(k+1)}$ of mixture components from the exponential family with interaction order $k$ is sufficient to represent any distribution.
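The sufficiency of $m = 2^{N-1}$ independent components can be illustrated with a small sketch. It rests on one assumption about how to split the sample space: pair each point of $\{0,1\}^N$ with the point that differs from it only in the last bit. Any distribution supported on such a pair is a product distribution, since the first $N-1$ bits are deterministic and the last bit is Bernoulli. This is one possible construction and not necessarily the one used in the paper; the function names `edge_mixture`, `product_prob`, and `mixture_prob` are hypothetical.

```python
import itertools


def edge_mixture(p, n):
    """Split p (dict: bit tuple -> probability) into 2^(n-1) product components.

    Assumption: points are paired by flipping the last bit. Each component is
    returned as a tuple of marginals P(x_j = 1), one entry per bit.
    """
    weights, components = [], []
    for prefix in itertools.product((0, 1), repeat=n - 1):
        x0, x1 = prefix + (0,), prefix + (1,)
        w = p[x0] + p[x1]                    # mixture weight alpha_i of this pair
        q = p[x1] / w if w > 0 else 0.5      # Bernoulli parameter; arbitrary if w == 0
        weights.append(w)
        components.append(tuple(float(b) for b in prefix) + (q,))
    return weights, components


def product_prob(marginals, x):
    """Probability of x under the independent distribution with these marginals."""
    prob = 1.0
    for m, bit in zip(marginals, x):
        prob *= m if bit == 1 else 1.0 - m
    return prob


def mixture_prob(weights, components, x):
    """Evaluate sum_i alpha_i f_i(x) for the mixture."""
    return sum(w * product_prob(m, x) for w, m in zip(weights, components))


# Example: a non-uniform distribution on {0,1}^3 (probabilities 1/36, ..., 8/36).
points = list(itertools.product((0, 1), repeat=3))
p = {x: (i + 1) / 36.0 for i, x in enumerate(points)}
alphas, comps = edge_mixture(p, 3)
max_error = max(abs(mixture_prob(alphas, comps, x) - p[x]) for x in points)
```

The sketch only shows that $2^{N-1}$ components suffice. It says nothing about the matching lower bound or about the families with interaction order $k$ mentioned in the abstract.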
