Dynamic Maintenance of Kernel Density Estimation Data Structure: From Practice to Theory

Abstract

Kernel density estimation (KDE) stands out as a challenging task in machine learning. The problem is defined as follows: given a kernel function $f(x,y)$ and a set of points $\{x_1, x_2, \cdots, x_n\} \subset \mathbb{R}^d$, we would like to compute $\frac{1}{n}\sum_{i=1}^{n} f(x_i,y)$ for any query point $y \in \mathbb{R}^d$. Recently, there has been a growing trend of using data structures for efficient KDE. However, the proposed KDE data structures focus on static settings. The robustness of KDE data structures over dynamically changing data distributions is not addressed. In this work, we focus on the dynamic maintenance of KDE data structures with robustness to adversarial queries. In particular, we provide a theoretical framework for KDE data structures. In our framework, the KDE data structures require only subquadratic space. Moreover, our data structure supports dynamic updates of the dataset in sublinear time. Furthermore, we can answer adaptive queries from a potential adversary in sublinear time.
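To make the problem statement concrete, the following is a minimal sketch of the naive baseline that evaluates the KDE sum exactly. It is not the data structure proposed in the paper: each query costs time linear in n, which is the cost a sublinear-time structure aims to beat. The Gaussian kernel, the `bandwidth` parameter, the class name `NaiveDynamicKDE`, and the "replace the point at index i" form of update are all assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Assumed example kernel: f(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


class NaiveDynamicKDE:
    """Hypothetical exact baseline: stores all points, answers queries in O(n * d)."""

    def __init__(self, points, kernel=gaussian_kernel):
        if not points:
            raise ValueError("the dataset must contain at least one point")
        self.points = [tuple(p) for p in points]
        self.kernel = kernel

    def update(self, i, new_point):
        """Replace the i-th data point (one assumed form of a dynamic update)."""
        self.points[i] = tuple(new_point)

    def query(self, y):
        """Return (1/n) * sum_i f(x_i, y) by direct summation over all points."""
        n = len(self.points)
        return sum(self.kernel(x, y) for x in self.points) / n


kde = NaiveDynamicKDE([(0.0, 0.0), (1.0, 0.0)])
before = kde.query((0.0, 0.0))   # average of f(x_1, y) and f(x_2, y)
kde.update(1, (0.0, 0.0))        # move the second point onto the query
after = kde.query((0.0, 0.0))    # both kernel values are now exp(0) = 1
```

Because this baseline recomputes the sum exactly on every query, its answers do not depend on earlier queries, so adaptive queries pose no difficulty for it. The adversarial-robustness concern in the abstract arises for faster approximate structures, which this sketch does not attempt to model.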
