Distributed Information-Theoretic Biclustering

15 February 2016

Abstract

We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sources $X^n$ and $Z^n$, respectively. The goal is to find rate-limited encodings $f(x^n)$ and $g(z^n)$ that maximize the mutual information $I(f(X^n); g(Z^n))/n$. There are strong ties to a number of information-theoretic problems, including hypothesis testing against independence, pattern recognition, the information bottleneck method, and lossy source coding with logarithmic-loss distortion. Improving previous cardinality bounds allows us to thoroughly study the example of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we generalize our results to the case of more than two i.i.d. sources. As a special case of this generalization, we investigate a Multiple Description (MD) extension of the CEO problem with log-loss distortion. Surprisingly, this MD-CEO problem permits a tight single-letter characterization of the achievable region, which has the remarkable feature that it allows exploiting rates that are in general insufficient to guarantee successful typicality decoding.
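To make the trade-off between encoding rates and retained mutual information concrete, the following minimal sketch evaluates single-letter quantities for a binary example. It is an illustration under stated assumptions, not the construction used in the paper: the source is assumed to be doubly symmetric binary with crossover probability `p`, and the auxiliary variables $U$ and $V$ are assumed to be obtained from $X$ and $Z$ through binary symmetric channels with hypothetical parameters `alpha` and `beta`. The function names are likewise hypothetical.

```python
import math


def mutual_information(joint):
    """Mutual information (in bits) of a joint pmf given as a list of rows."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    total = 0.0
    for i, r in enumerate(joint):
        for j, pij in enumerate(r):
            if pij > 0:  # terms with zero probability contribute nothing
                total += pij * math.log2(pij / (row[i] * col[j]))
    return total


def bsc(eps):
    """Transition matrix of a binary symmetric channel with crossover eps."""
    return [[1 - eps, eps], [eps, 1 - eps]]


def single_letter_point(p, alpha, beta):
    """Return (I(U;X), I(V;Z), I(U;V)) for the Markov chain U - X - Z - V.

    Assumptions (illustrative only): X is uniform on {0, 1}, Z is X passed
    through BSC(p), U is X through BSC(alpha), V is Z through BSC(beta).
    """
    p_xz = [[0.5 * q for q in r] for r in bsc(p)]
    a, b = bsc(alpha), bsc(beta)
    p_xu = [[0.5 * a[x][u] for u in range(2)] for x in range(2)]
    p_zv = [[0.5 * b[z][v] for v in range(2)] for z in range(2)]
    # Marginalize X and Z out of p(u|x) p(x,z) p(v|z) to obtain p(u,v).
    p_uv = [
        [
            sum(a[x][u] * p_xz[x][z] * b[z][v]
                for x in range(2) for z in range(2))
            for v in range(2)
        ]
        for u in range(2)
    ]
    return (mutual_information(p_xu),
            mutual_information(p_zv),
            mutual_information(p_uv))


if __name__ == "__main__":
    r1, r2, mu = single_letter_point(p=0.1, alpha=0.1, beta=0.2)
    print(f"I(U;X) = {r1:.4f}, I(V;Z) = {r2:.4f}, I(U;V) = {mu:.4f}")
```

Sweeping `alpha` and `beta` over $[0, 1/2]$ traces how much mutual information between the two descriptions survives as the single-letter rates $I(U;X)$ and $I(V;Z)$ shrink. By the data processing inequality, $I(U;V)$ can never exceed either of them.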
