Sample Size Dependent Species Models

Mingyuan Zhou
Abstract

Motivated by the fundamental problem of measuring species diversity, this paper introduces the concept of a cluster structure to define an exchangeable cluster probability function that governs the joint distribution of a random count and its exchangeable random partitions. A cluster structure, which arises naturally from a completely random measure mixed Poisson process, allows the probability distribution of the random partitions of a subset of a sample to depend on the sample size, a distinctive and well-motivated feature that distinguishes it from a partition structure. A generalized negative binomial process model is proposed to generate a cluster structure; under its prior, the number of clusters is finite and Poisson distributed, and the cluster sizes follow a truncated negative binomial distribution. We construct a nonparametric Bayesian estimator of Simpson's index of diversity under the generalized negative binomial process. We illustrate our results through the analysis of two real sequencing count datasets.
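
As a rough illustration of the quantities named in the abstract, the sketch below draws a Poisson number of clusters with zero-truncated negative binomial sizes and computes the classical plug-in form of Simpson's index from the resulting counts. It is not the paper's generalized negative binomial process or its Bayesian estimator: the function names, the parameters `lam`, `r`, and `p`, and the reading of "truncated" as "truncated at zero" are all assumptions made for illustration.

```python
import math
import random


def simpson_index(counts):
    """Classical Simpson's index: the probability that two individuals drawn
    without replacement from the sample belong to the same species."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two individuals")
    return sum(c * (c - 1) for c in counts) / (n * (n - 1))


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam only."""
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_truncated_nb(r, p, rng):
    """Zero-truncated NB(r, p) via a gamma-Poisson mixture with rejection of
    zeros (assumed parameterization: pmf proportional to p**k * (1-p)**r)."""
    while True:
        rate = rng.gammavariate(r, p / (1.0 - p))
        k = sample_poisson(rate, rng)
        if k >= 1:
            return k


def sample_cluster_sizes(lam, r, p, rng):
    """Hypothetical prior draw: Poisson(lam) clusters, truncated NB sizes."""
    num_clusters = sample_poisson(lam, rng)
    return [sample_truncated_nb(r, p, rng) for _ in range(num_clusters)]


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = sample_cluster_sizes(lam=5.0, r=1.0, p=0.6, rng=rng)
    print("cluster sizes:", sizes)
    if sum(sizes) >= 2:
        print("Simpson's index:", simpson_index(sizes))
```

For species counts of 3, 2, and 1, `simpson_index([3, 2, 1])` evaluates (3·2 + 2·1 + 1·0) / (6·5) = 8/30. Values near 0 indicate high diversity, and values near 1 indicate a sample dominated by one species.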