
Non-Parametric Estimation of Manifolds from Noisy Data

Abstract

A common observation in data-driven applications is that high-dimensional data has a low intrinsic dimension, at least locally. In this work, we consider the problem of estimating a $d$-dimensional sub-manifold of $\mathbb{R}^D$ from a finite set of noisy samples. Assuming that the data was sampled uniformly from a tubular neighborhood of $\mathcal{M}\in \mathcal{C}^k$, a compact manifold without boundary, we present an algorithm that takes a point $r$ from the tubular neighborhood and outputs $\hat p_n\in \mathbb{R}^D$ and $\widehat{T_{\hat p_n}\mathcal{M}}$, an element of the Grassmannian $Gr(d, D)$. We prove that, as the number of samples $n\to\infty$, the point $\hat p_n$ converges to $p\in \mathcal{M}$ and $\widehat{T_{\hat p_n}\mathcal{M}}$ converges to $T_p\mathcal{M}$ (the tangent space at that point) with high probability. Furthermore, we show that the estimation yields asymptotic rates of convergence of $n^{-\frac{k}{2k + d}}$ for the point estimation and $n^{-\frac{k-1}{2k + d}}$ for the estimation of the tangent space. These rates are known to be optimal for the case of function estimation.
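To make the stated rates concrete, the following minimal sketch evaluates the two exponents for a given smoothness $k$ and intrinsic dimension $d$. The function name `rate_exponents` is hypothetical and not part of the paper; the sketch covers only the exponents quoted above and ignores constants and the high-probability qualifier.

```python
from fractions import Fraction


def rate_exponents(k: int, d: int) -> tuple[Fraction, Fraction]:
    """Return the exponents (a, b) such that the stated rates are
    n**(-a) for the point estimate and n**(-b) for the tangent estimate.

    Hypothetical helper for illustration only: k is the smoothness of the
    manifold (M in C^k) and d is its intrinsic dimension.
    """
    if k < 1 or d < 1:
        raise ValueError("k and d must be positive integers")
    denom = 2 * k + d
    point = Fraction(k, denom)        # exponent k / (2k + d)
    tangent = Fraction(k - 1, denom)  # exponent (k - 1) / (2k + d)
    return point, tangent


if __name__ == "__main__":
    # A C^2 curve (k = 2, d = 1): exponents 2/5 and 1/5.
    print(rate_exponents(2, 1))
```

Note that the ambient dimension $D$ does not appear in either exponent: the stated rates depend only on $k$ and $d$. For a $\mathcal{C}^2$ curve ($k = 2$, $d = 1$), the rates are $n^{-2/5}$ for the point and $n^{-1/5}$ for the tangent space.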
