Kernel Density Estimation with Linked Boundary Conditions

Kernel density estimation on a finite interval poses an outstanding challenge because of the well-recognized boundary bias at the endpoints. Motivated by an application of density estimation in biology, we consider a new type of boundary constraint, in which the values of the estimator at the two boundary points are linked. We provide a kernel density estimator that successfully incorporates this linked boundary condition, leading to a non-self-adjoint diffusion process. The kernel is a non-symmetric heat kernel which generates a series expansion in non-separable generalized eigenfunctions of the spatial differential operator. We analyze and solve the resulting problem through the unified transform, which gives an integral representation of the solution in the complex plane and allows us to rigorously develop the theory of such estimators. We apply our method to our motivating example in biology and provide numerical experiments with synthetic data, including comparisons with state-of-the-art kernel density estimators. The experiments suggest that the method is fast and easy to use, and that it compares favorably in both accuracy and speed with existing methods, which are currently unable to handle such linked boundary constraints. The analysis presented here can also be extended to more general types of boundary conditions that may be encountered in applications.
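The boundary bias mentioned above can be seen in a small experiment. The following is a minimal sketch, not the estimator proposed in the paper: it applies a plain Gaussian kernel density estimator, with no boundary correction, to samples drawn uniformly on [0, 1]. The function name gaussian_kde_plain, the sample size, the seed, and the bandwidth of 0.05 are illustrative assumptions.

```python
import numpy as np


def gaussian_kde_plain(x_eval, samples, bandwidth):
    """Plain Gaussian KDE on the real line, with no boundary correction.

    Hypothetical helper for illustration only; this is NOT the
    linked-boundary estimator described in the abstract.
    """
    # Pairwise standardized distances, shape (len(x_eval), len(samples)).
    z = (x_eval[:, None] - samples[None, :]) / bandwidth
    # Sum the Gaussian bumps and normalize so the estimate integrates to 1 on R.
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


# Assumed setup: the true density is uniform on [0, 1], so f(x) = 1 there.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=20000)

# Evaluate at the two endpoints and at the midpoint of the interval.
x_eval = np.array([0.0, 0.5, 1.0])
estimate = gaussian_kde_plain(x_eval, samples, bandwidth=0.05)

for x, value in zip(x_eval, estimate):
    print(f"x = {x:.1f}: estimated density = {value:.3f} (true density = 1)")
```

At the midpoint the estimate should be close to the true value of 1, while at each endpoint it should be close to one half, because about half of each kernel's mass falls outside the interval. Correcting this kind of bias, while also respecting a constraint that links the values at the two endpoints, is the problem the abstract addresses.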