
Means and medians of sets of persistence diagrams

Abstract

The persistence diagram is the fundamental object in topological data analysis. It inherits the stochastic variability of the data used as input, so we need to understand how to perform statistics on the space of persistence diagrams. This paper looks at the space of persistence diagrams under a variety of metrics that are analogous to the $L^p$ metrics on the space of functions. Using these metrics we can form different cost functions, each defining a central tendency and a corresponding measure of variability. This gives natural definitions of both the mean and the median of a finite number of persistence diagrams. We give a characterization of the mean and the median of an odd number of persistence diagrams. Although there are examples in which the mean is neither unique nor continuous, we prove that generically the mean of a set of persistence diagrams with finitely many off-diagonal points is unique. In comparison, the sets of persistence diagrams with finitely many off-diagonal points that do not have a unique median form a set of positive measure.
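The idea that a cost function defines a central tendency can be illustrated with a one-dimensional analogue. The sketch below works on real numbers, not persistence diagrams, and does not reproduce the paper's construction: it assumes the familiar setting in which the mean minimizes the sum of squared distances and the median minimizes the sum of absolute distances. The function names, the sample values, and the grid search are hypothetical choices made only for this illustration.

```python
# One-dimensional analogue (assumption: real numbers stand in for diagrams).
# A central tendency is any minimizer of a chosen cost function.

def squared_cost(c, xs):
    """Sum of squared distances from c to the sample; minimized by the mean."""
    return sum((x - c) ** 2 for x in xs)

def absolute_cost(c, xs):
    """Sum of absolute distances from c to the sample; minimized by a median."""
    return sum(abs(x - c) for x in xs)

def minimizers(cost, xs, grid, tol=1e-9):
    """Return every grid point whose cost is within tol of the smallest cost."""
    costs = [cost(c, xs) for c in grid]
    best = min(costs)
    return [c for c, v in zip(grid, costs) if abs(v - best) < tol]

# Candidate centers 0.0, 0.1, ..., 10.0 (a coarse grid, for illustration only).
grid = [i / 10 for i in range(0, 101)]

odd_sample = [1.0, 2.0, 6.0]        # odd number of points
even_sample = [1.0, 2.0, 6.0, 7.0]  # even number of points

mean_odd = minimizers(squared_cost, odd_sample, grid)
median_odd = minimizers(absolute_cost, odd_sample, grid)
median_even = minimizers(absolute_cost, even_sample, grid)

print("mean of odd sample:", mean_odd)
print("median of odd sample:", median_odd)
print("medians of even sample span:", median_even[0], "to", median_even[-1])
```

In this toy setting the squared cost has a single minimizer, while the absolute cost of the even sample is flat on a whole interval, so every point of that interval is a median. This is only an analogy for the non-uniqueness of medians discussed in the abstract; the behaviour on the space of persistence diagrams is what the paper itself establishes.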
