PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

We propose a likelihood-free method for comparing two distributions given samples from each, with the goal of assessing the quality of generative models. The proposed approach, PQMass, provides a statistically rigorous method for assessing a single generative model or comparing multiple competing models. PQMass divides the sample space into non-overlapping regions and applies chi-squared tests to the number of data samples that fall within each region, giving a p-value for the null hypothesis that the bin counts derived from the two sets of samples are drawn from the same multinomial distribution. PQMass does not depend on assumptions regarding the density of the true distribution, nor does it rely on training or fitting any auxiliary models. We evaluate PQMass on data of various modalities and dimensions, demonstrating its effectiveness in assessing the quality, novelty, and diversity of generated samples. We further show that PQMass scales well to moderately high-dimensional data and thus obviates the need for feature extraction in practical applications.
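The procedure described in the abstract (partition the space, count samples per region, apply a chi-squared test) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the regions are Voronoi cells around reference points drawn at random from the pooled samples, which is one possible choice of non-overlapping regions. The names `nearest_region`, `chi2_sf`, and `two_sample_region_test` are hypothetical, and `chi2_sf` is a simple series approximation intended only for moderate values of the statistic.

```python
import math
import random


def nearest_region(point, references):
    """Index of the reference point closest to `point` (its Voronoi cell)."""
    return min(range(len(references)),
               key=lambda i: math.dist(point, references[i]))


def chi2_sf(x, dof):
    """Upper tail probability of a chi-squared variable with `dof` degrees
    of freedom, via the series for the lower incomplete gamma function.
    Assumption: adequate for a sketch, not for extreme tails."""
    if dof < 1:
        raise ValueError("need at least one degree of freedom")
    if x <= 0:
        return 1.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < total * 1e-16:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def two_sample_region_test(samples_a, samples_b, n_regions=20, rng=None):
    """Return (statistic, dof, p_value) for the null hypothesis that both
    samples yield region counts from the same multinomial distribution."""
    rng = rng or random.Random()
    pooled = [(p, 0) for p in samples_a] + [(p, 1) for p in samples_b]
    rng.shuffle(pooled)
    # Assumption: the first n_regions shuffled points define the cells
    # and are excluded from the counts.
    references = [p for p, _ in pooled[:n_regions]]
    counts = [[0, 0] for _ in references]
    for point, label in pooled[n_regions:]:
        counts[nearest_region(point, references)][label] += 1

    n_a = sum(c[0] for c in counts)
    n_b = sum(c[1] for c in counts)
    stat, used = 0.0, 0
    for a, b in counts:
        if a + b == 0:
            continue  # empty regions carry no information
        used += 1
        share = (a + b) / (n_a + n_b)  # pooled estimate of the region mass
        stat += (a - n_a * share) ** 2 / (n_a * share)
        stat += (b - n_b * share) ** 2 / (n_b * share)
    dof = used - 1
    return stat, dof, chi2_sf(stat, dof)
```

A small p-value suggests that the two sample sets place different probability mass on the regions. Because the partition is random, repeating the test over several partitions gives a more stable picture than a single run.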
@article{lemos2025_2402.04355,
  title={PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation},
  author={Pablo Lemos and Sammy Sharief and Nikolay Malkin and Salma Salhi and Conner Stone and Laurence Perreault-Levasseur and Yashar Hezaveh},
  journal={arXiv preprint arXiv:2402.04355},
  year={2025}
}