Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more

Abstract

A coreset (or core-set) in this paper is a small weighted \emph{subset} $Q$ of the input set $P$ with respect to a given \emph{monotonic} function $f:\mathbb{R}\to\mathbb{R}$, such that $Q$ \emph{provably} approximates the fitting loss $\sum_{p\in P}f(p\cdot x)$ for \emph{any} given $x\in\mathbb{R}^d$. Using $Q$, we can approximate the minimizer $x^*$ of this loss by running \emph{existing} optimization algorithms on $Q$. We provide: (I) a lower bound proving that there are input sets for which no coreset is smaller than $n=|P|$; (II) a proof that a coreset of size near-logarithmic in $n$ exists for \emph{any} input $P$, under a natural assumption that holds, e.g., for logistic regression and the sigmoid activation function; (III) a generic algorithm that computes $Q$ in $O(nd+n\log n)$ expected time; (IV) extensive experimental results, with open code and benchmarks, showing that the coresets are even smaller in practice. Existing papers (e.g., [Huggins, Campbell, Broderick 2016]) suggested only specific coresets for specific input sets.
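To make the coreset idea concrete, here is a minimal sketch of how a weighted subset can stand in for the full data when evaluating a monotonic fitting loss such as the logistic loss $f(t)=\log(1+e^t)$. The sampler below uses plain uniform sampling with uniform weights purely as an illustration; it is \emph{not} the paper's algorithm, which achieves provable guarantees with a near-logarithmic coreset size. The function names (`logistic_loss`, `uniform_coreset`) are illustrative, not from the paper.

```python
import numpy as np

def logistic_loss(P, x, w=None):
    """Weighted fitting loss sum_p w_p * f(p . x), with f(t) = log(1 + e^t)."""
    w = np.ones(len(P)) if w is None else w
    # np.logaddexp(0, t) computes log(1 + e^t) in a numerically stable way.
    return float(np.sum(w * np.logaddexp(0.0, P @ x)))

def uniform_coreset(P, m, rng):
    """Illustrative uniform-sampling 'coreset' (NOT the paper's algorithm).

    Sampling m points i.i.d. and weighting each by n/m gives an unbiased
    estimator of the full loss, but without the worst-case guarantees of
    a sensitivity-based construction.
    """
    n = len(P)
    idx = rng.choice(n, size=m, replace=True)
    weights = np.full(m, n / m)
    return P[idx], weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.standard_normal((200, 3))      # n = 200 points in R^3
    x = np.array([0.5, -1.0, 0.25])        # a fixed query vector

    full = logistic_loss(P, x)
    Q, w = uniform_coreset(P, m=4000, rng=rng)
    approx = logistic_loss(Q, x, w)

    print(f"full loss   = {full:.3f}")
    print(f"coreset loss = {approx:.3f}")
    print(f"relative error = {abs(full - approx) / full:.4f}")
```

In practice one would minimize the weighted loss over the coreset $(Q, w)$ with any off-the-shelf solver; the coreset guarantee then transfers the approximation to the recovered minimizer.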
