Coresets For Monotonic Functions with Applications to Deep Learning

Abstract

A coreset (or core-set) in this paper is a small weighted \emph{subset} $Q$ of the input set $P$ with respect to a given \emph{monotonic} function $f:\mathbb{R}\to\mathbb{R}$ that \emph{provably} approximates its fitting loss $\sum_{p\in P}f(p\cdot x)$ for \emph{any} given $x\in\mathbb{R}^d$. Using $Q$ we can obtain an approximation to the $x^*$ that minimizes this loss, by running \emph{existing} optimization algorithms on $Q$. We provide: (i) a lower bound proving that there are sets with no coreset smaller than $n=|P|$; (ii) a proof that a coreset of size near-logarithmic in $n$ exists for \emph{any} input $P$, under a natural assumption that holds e.g. for logistic regression and the sigmoid activation function; (iii) a generic algorithm that computes $Q$ in $O(nd+n\log n)$ expected time; (iv) a novel technique for improving existing deep networks using such coresets; (v) extensive experimental results with open code.
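To make the coreset notion concrete, here is a minimal Python sketch of the general recipe the abstract describes: sample a small weighted subset $Q$ of $P$ by importance sampling and use inverse-probability weights so that the weighted loss $\sum_{q\in Q} w_q f(q\cdot x)$ is an unbiased estimate of $\sum_{p\in P} f(p\cdot x)$. The norm-based sampling distribution below is a hypothetical surrogate for the paper's actual sensitivity scores, not the authors' algorithm, and `f` defaults to the logistic loss mentioned in the abstract.

```python
import numpy as np

def coreset(P, m, rng=None):
    """Sample a weighted subset (Q, w) of m rows from P (shape n x d).

    The sampling probabilities below are a simple norm-based surrogate
    (hypothetical) for the paper's sensitivity scores; a uniform floor
    keeps every probability strictly positive.
    """
    rng = np.random.default_rng(rng)
    n = P.shape[0]
    s = np.linalg.norm(P, axis=1) + 1e-12
    prob = 0.5 * s / s.sum() + 0.5 / n
    idx = rng.choice(n, size=m, replace=True, p=prob)
    # Inverse-probability weights make the weighted loss an unbiased
    # estimator of the full loss for every fixed x.
    w = 1.0 / (m * prob[idx])
    return P[idx], w

def fitting_loss(P, x, w=None, f=lambda t: np.log1p(np.exp(-t))):
    """Weighted fitting loss sum_p w_p * f(p . x); f defaults to logistic loss."""
    vals = f(P @ x)
    return vals.sum() if w is None else (w * vals).sum()
```

A typical usage pattern is to build `Q, w = coreset(P, m)` once and then run any existing optimizer on the weighted loss `fitting_loss(Q, x, w)` instead of the full one; for monotonic $f$ and large enough $m$ the two losses stay close for every candidate $x$.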
