Non-Parametric Estimation of Manifolds from Noisy Data

A common observation in data-driven applications is that high-dimensional data has a low intrinsic dimension, at least locally. In this work, we consider the problem of estimating a $d$-dimensional sub-manifold of $\mathbb{R}^D$ from a finite set of noisy samples. Assuming that the data was sampled uniformly from a tubular neighborhood of $\mathcal{M} \in \mathcal{C}^k$, a compact manifold without boundary, we present an algorithm that takes a point $r$ from the tubular neighborhood and outputs a point $\hat{p}_n \in \mathbb{R}^D$ and $\widehat{T_{\hat{p}_n}\mathcal{M}}$, an element of the Grassmannian $Gr(d, D)$. We prove that as the number of samples $n \to \infty$, the point $\hat{p}_n$ converges to $p \in \mathcal{M}$ and $\widehat{T_{\hat{p}_n}\mathcal{M}}$ converges to $T_p\mathcal{M}$ (the tangent space at that point) with high probability. Furthermore, we show that the estimation yields asymptotic rates of convergence of $n^{-\frac{k}{2k+d}}$ for the point estimation and $n^{-\frac{k-1}{2k+d}}$ for the estimation of the tangent space. These rates are known to be optimal for the case of function estimation.
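To make the problem setting concrete, the following is a minimal sketch of the input and output shapes only. It is not the algorithm of this paper: it substitutes plain local PCA as a stand-in estimator, and it assumes a toy case in which the manifold is the unit circle in the plane. The function names `sample_tube_around_circle` and `local_pca_tangent`, the noise level, and the neighborhood radius are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_tube_around_circle(n, sigma, rng):
    """Draw n points uniformly from the tube of radius sigma around the unit circle."""
    # In the plane this tube is the annulus 1 - sigma <= r <= 1 + sigma.
    # A uniform density on an annulus is proportional to r, so r**2 is uniform.
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    r = np.sqrt(rng.uniform((1.0 - sigma) ** 2, (1.0 + sigma) ** 2, size=n))
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))


def local_pca_tangent(samples, query, radius, d):
    """Stand-in estimator (local PCA), NOT the paper's method.

    Returns the mean of the samples within `radius` of `query` and an
    orthonormal basis (d rows) for the top-d principal directions there.
    """
    nearby = samples[np.linalg.norm(samples - query, axis=1) <= radius]
    center = nearby.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(nearby - center, full_matrices=False)
    return center, vt[:d]


rng = np.random.default_rng(0)
samples = sample_tube_around_circle(n=20000, sigma=0.05, rng=rng)

# A query point inside the tube, slightly off the circle.
query = np.array([1.03, 0.0])
center, basis = local_pca_tangent(samples, query, radius=0.3, d=1)

print("point estimate:", center)
print("tangent basis:", basis)
```

In this toy case the true tangent line near the query is the vertical direction, so the returned basis vector should be close to $(0, \pm 1)$. The local mean is pulled slightly toward the inside of the circle by curvature, which is one reason a plain local-PCA estimate like this one should not be expected to attain the rates stated in the abstract.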