Coresets for Data Discretization and Sine Wave Fitting

In the \emph{monitoring} problem, the input is an unbounded stream of integers in a given range, obtained from a sensor (such as GPS readings or human heartbeats). The goal (e.g., for anomaly detection) is to approximate the points received so far by a single-frequency sine wave, e.g., by minimizing a fitting cost plus a given regularization function over a feasible set of solutions. For any approximation error, we prove that \emph{every} set of integers has a small weighted subset (sometimes called a coreset) that approximates this cost, for every candidate solution, up to a multiplicative factor determined by the error. Using known coreset techniques, this implies streaming algorithms that use only a small amount of memory. Our results hold for a large family of functions. Experimental results and open source code are provided.
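To make the notion of a weighted subset that preserves the fitting cost concrete, here is a minimal Python sketch. It is not the construction from the paper. The cost form (a weighted sum of squared sines of `2*pi*p*c/N`), the range bound `N`, and the helper names `cost`, `merge_duplicates`, and `max_relative_error` are all assumptions made for illustration. The subset is built by merging repeated integers into one point weighted by its multiplicity, which preserves the cost exactly for every frequency but gives no guarantee on the subset's size.

```python
import math
from collections import Counter


def cost(points, weights, c, N):
    """Assumed cost form: sum_i w_i * sin^2(2*pi*p_i*c/N)."""
    return sum(w * math.sin(2 * math.pi * p * c / N) ** 2
               for p, w in zip(points, weights))


def merge_duplicates(points):
    """Keep each distinct integer once, weighted by how often it occurs.

    Illustrative stand-in for a coreset: exact, but not guaranteed small.
    """
    counts = Counter(points)
    subset = sorted(counts)
    return subset, [counts[p] for p in subset]


def max_relative_error(points, subset, weights, N):
    """Worst relative gap between full and subset cost over c = 1..N."""
    unit_weights = [1] * len(points)
    worst = 0.0
    for c in range(1, N + 1):
        full = cost(points, unit_weights, c, N)
        approx = cost(subset, weights, c, N)
        if full > 1e-12:  # skip frequencies whose cost is numerically zero
            worst = max(worst, abs(full - approx) / full)
    return worst


if __name__ == "__main__":
    N = 64  # hypothetical bound on the integer range
    stream = [3, 3, 7, 7, 7, 12, 40, 40]  # made-up sensor readings
    subset, weights = merge_duplicates(stream)
    print("subset:", subset, "weights:", weights)
    print("max relative error:", max_relative_error(stream, subset, weights, N))
```

A coreset in the sense of the abstract would instead trade a bounded multiplicative error for a subset whose size is provably small. The same `max_relative_error` check could be used to evaluate such a subset empirically.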