Exchangeable Trait Allocations

Clustering requires placing data into mutually exclusive groups, while feature allocations allow each datum to exhibit binary membership in multiple groups. Often, however, a data point not only belongs to multiple groups but also has a different level of belonging in each group. We refer to the corresponding relaxation of these combinatorial structures as a "trait allocation." The exchangeable partition probability function (EPPF) allows for practical inference in clustering models, and the Kingman paintbox provides a representation for clustering that allows us to study all exchangeable clustering models at once. We provide the analogous exchangeable trait probability function (ETPF) and paintbox representation for trait allocations, along with a characterization of all trait allocations with an ETPF. Our proofs avoid the unnecessary auxiliary randomness of previous specialized constructions and, unlike previous feature allocation characterizations, fully capture single-occurrence "dust" groups. We further introduce a novel constrained version of the ETPF that we use to establish the first direct connection between the probability functions for clustering, feature allocations, and trait allocations. As an application of our general theory, we characterize the distribution of all edge-exchangeable graphs, a recently developed class of models that captures realistic sparse graph sequences.
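To make the three combinatorial structures concrete, the following is a minimal sketch, not taken from the paper. It assumes a trait allocation can be represented as a list of multisets over data indices 1..n, each stored as a `Counter` with strictly positive multiplicities. The function names and the example data are hypothetical and chosen for illustration only.

```python
from collections import Counter

# Assumed representation (illustrative only): an allocation is a list of
# groups; each group is a Counter mapping a data index to its positive
# multiplicity, i.e. its level of belonging in that group.

def is_feature_allocation(groups):
    """True if every membership is binary (multiplicity exactly 1)."""
    return all(m == 1 for g in groups for m in g.values())

def is_partition(groups, n):
    """True if memberships are binary and each index 1..n is in exactly one group."""
    if not is_feature_allocation(groups):
        return False
    seen = Counter()
    for g in groups:
        seen.update(g.keys())  # count how many groups contain each index
    return set(seen) == set(range(1, n + 1)) and all(c == 1 for c in seen.values())

# Hypothetical examples over three data points.
clustering = [Counter({1: 1, 2: 1}), Counter({3: 1})]        # mutually exclusive groups
features = [Counter({1: 1, 2: 1}), Counter({2: 1, 3: 1})]    # datum 2 is in two groups
traits = [Counter({1: 2, 2: 1}), Counter({2: 3, 3: 1})]      # graded membership
```

Under this representation, every partition is a feature allocation and every feature allocation is a trait allocation, which mirrors the chain of relaxations described above.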