The Size of a $t$-Digest

Abstract

A $t$-digest is a compact data structure that allows estimates of quantiles with increased accuracy near $q = 0$ or $q = 1$. This is done by clustering samples from $\mathbb{R}$ subject to the constraint that the number of points associated with any particular centroid is limited so that the so-called $k$-size of the centroid is always $\le 1$. The $k$-size is defined using a scale function that maps quantile $q$ to index $k$. This paper provides bounds on the sizes of $t$-digests created using any of four known scale functions.
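To make the $k$-size constraint concrete, the following is a minimal sketch, not code from the paper. It assumes the arcsine scale function $k(q) = \frac{\delta}{2\pi} \sin^{-1}(2q - 1)$, commonly called $k_1$ in the $t$-digest literature; the abstract does not name its four scale functions, so this choice, the compression parameter `delta`, and all function names are illustrative assumptions. The $k$-size of a centroid covering quantiles from `q_left` to `q_right` is taken to be $k(q_{\text{right}}) - k(q_{\text{left}})$.

```python
import math


def scale_k1(q, delta):
    """Assumed scale function: maps quantile q in [0, 1] to index k."""
    return delta / (2 * math.pi) * math.asin(2 * q - 1)


def k_size(q_left, q_right, delta):
    """k-size of a centroid spanning the quantile range [q_left, q_right]."""
    return scale_k1(q_right, delta) - scale_k1(q_left, delta)


def max_width(q_left, delta):
    """Widest quantile range starting at q_left whose k-size is at most 1."""
    k_right = scale_k1(q_left, delta) + 1.0
    if k_right >= scale_k1(1.0, delta):
        # The centroid may extend all the way to q = 1.
        return 1.0 - q_left
    # Invert the scale function to recover the right-hand quantile.
    q_right = (math.sin(2 * math.pi * k_right / delta) + 1.0) / 2.0
    return q_right - q_left


if __name__ == "__main__":
    delta = 100  # hypothetical compression parameter
    for q in (0.0, 0.01, 0.5):
        print(q, max_width(q, delta))
```

Under this assumed scale function, a centroid starting at $q = 0$ may cover a much narrower range of quantiles than one starting at $q = 0.5$, which is how the constraint concentrates accuracy in the tails.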
