
Fast rates for noisy clustering

Abstract

The effect of errors in variables in empirical minimization is investigated. Given a loss $\ell$ and a set of decision rules $\mathcal{G}$, we prove a general upper bound for empirical minimization based on a deconvolution kernel and a noisy sample $Z_i = X_i + \epsilon_i$, $i = 1, \ldots, n$. We apply this general upper bound to derive the rate of convergence of the expected excess risk in noisy clustering. A recent bound of \citet{levrard} shows that this rate is $\mathcal{O}(1/n)$ in the direct case, under Pollard's regularity assumptions. Here the effect of noisy measurements yields a rate of the form $\mathcal{O}(1/n^{\frac{\gamma}{\gamma+2\beta}})$, where $\gamma$ is the Hölder regularity of the density of $X$ and $\beta$ is the degree of ill-posedness.
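The observation model above can be illustrated with a minimal simulation. The sketch below (not from the paper; all parameters and the plain Lloyd's k-means routine are illustrative assumptions) draws a direct sample $X$ from a two-cluster mixture, corrupts it additively to obtain the observed sample $Z_i = X_i + \epsilon_i$, and clusters the noisy data:

```python
import numpy as np

# Illustrative sketch of the errors-in-variables clustering model
# Z_i = X_i + eps_i. Parameters are assumptions for demonstration;
# the paper's deconvolution-based estimator is not implemented here.
rng = np.random.default_rng(0)
n = 500

# Direct (unobserved) sample X: two clusters centered at -3 and +3.
labels = rng.integers(0, 2, size=n)
X = np.where(labels == 0, -3.0, 3.0) + 0.5 * rng.standard_normal(n)

# Observed noisy sample Z = X + eps (errors in variables).
eps = rng.standard_normal(n)
Z = X + eps

def lloyd_kmeans(z, k=2, iters=50):
    """Plain Lloyd's algorithm on a one-dimensional sample."""
    centers = np.sort(rng.choice(z, size=k, replace=False))
    for _ in range(iters):
        # Assign each point to its nearest center, then recenter.
        assign = np.abs(z[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = z[assign == j].mean()
    return np.sort(centers)

centers = lloyd_kmeans(Z)
print(centers)  # recovers centers near -3 and +3 from the noisy sample
```

Clustering $Z$ directly ignores the measurement noise; the paper's point is that a deconvolution kernel is needed to target the distribution of $X$, at the price of the slower rate stated above.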
