Density Level Set Estimation on Manifolds with DBSCAN

DBSCAN is one of the most popular clustering algorithms amongst practitioners, but it has received comparatively little theoretical treatment. We show that, given $\lambda > 0$ and with its parameters set in appropriate ranges, DBSCAN estimates the connected components of the $\lambda$-density level set (i.e. $\{x : f(x) \ge \lambda\}$, where $f$ is the density). We characterize the regularity of the level set boundaries using a parameter $\beta > 0$ and analyze the estimation error under the Hausdorff metric. When the data lies in $\mathbb{R}^D$, we obtain an estimation rate of $\widetilde{O}(n^{-1/(2\beta + D)})$, which matches known lower bounds up to logarithmic factors. When the data lies on an embedded unknown $d$-dimensional manifold in $\mathbb{R}^D$, we obtain an estimation rate of $\widetilde{O}(n^{-1/(2\beta + d \cdot \max\{1, \beta\})})$. Finally, we provide adaptive parameter tuning that attains these rates with no a priori knowledge of the intrinsic dimension, the density, or $\beta$.
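To make the link between DBSCAN and level sets concrete, here is a minimal sketch, not the paper's procedure. It assumes the standard DBSCAN notion of a core point (at least `min_pts` sample points within distance `eps`, counting the point itself) and groups core points that lie within `eps` of each other. A core point is one where a uniform-kernel density estimate exceeds roughly `min_pts / (n * volume of the eps-ball)`, so the helper `min_pts_for_level` turns a target level `lam` into a neighbor count under that heuristic. The function names, the toy data, and this parameter mapping are illustrative assumptions; the ranges analyzed in the paper may differ.

```python
import math


def ball_volume(dim, radius):
    """Volume of a Euclidean ball of the given radius in `dim` dimensions."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


def min_pts_for_level(lam, n, dim, eps):
    """Heuristic (assumed) mapping from a density level to a neighbor count:
    count / (n * vol) >= lam  <=>  count >= lam * n * vol."""
    return max(1, math.ceil(lam * n * ball_volume(dim, eps)))


def core_components(points, eps, min_pts):
    """Return connected components of DBSCAN core points as sorted index lists.

    Two core points are linked when they are within `eps` of each other.
    Non-core points (border and noise) are left out of the estimate.
    """
    n = len(points)
    # Brute-force eps-neighborhoods; each point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [i for i in range(n) if len(neighbors[i]) >= min_pts]
    core_set = set(core)

    visited = set()
    components = []
    for start in core:
        if start in visited:
            continue
        # Depth-first search restricted to core points.
        visited.add(start)
        stack = [start]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in neighbors[i]:
                if j in core_set and j not in visited:
                    visited.add(j)
                    stack.append(j)
        components.append(sorted(component))
    return components


# Toy data: two tight groups of four points and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 10.0),
]
components = core_components(points, eps=0.2, min_pts=3)
print(components)
```

On this toy input the two tight groups each form one component and the isolated point is excluded, since it has no neighbors other than itself. The brute-force neighbor search is quadratic in the sample size and is used here only for clarity.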