
Online Lewis Weight Sampling

Abstract

The seminal work of Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community, yielding fast row sampling algorithms for approximating d-dimensional subspaces of ℓ_p up to (1+ϵ) error. Several works have extended this important primitive to other settings, including the online coreset and sliding window models. However, these results hold only for p ∈ {1, 2}, and the results for p = 1 require a suboptimal Õ(d²/ϵ²) samples. In this work, we design the first nearly optimal ℓ_p subspace embeddings for all p ∈ (0, ∞) in the online coreset and sliding window models. In both models, our algorithms store Õ(d^{1∨(p/2)}/ϵ²) rows. This answers a substantial generalization of the main open question of [BDMMUWZ2020], and gives the first results for all p ∉ {1, 2}. Towards our result, we give the first analysis of "one-shot" Lewis weight sampling, in which rows are sampled proportionally to their Lewis weights, with sample complexity Õ(d^{p/2}/ϵ²) for p > 2. Previously, this scheme was only known to have sample complexity Õ(d^{p/2}/ϵ⁵), whereas Õ(d^{p/2}/ϵ²) was known only if a more sophisticated recursive sampling is used. The recursive sampling cannot be implemented online, thus necessitating an analysis of one-shot Lewis weight sampling. Our analysis uses a novel connection to online numerical linear algebra. As an application, we obtain the first one-pass streaming coreset algorithms for (1+ϵ) approximation of important generalized linear models, such as logistic regression and p-probit regression. Our upper bounds are parameterized by a complexity parameter μ introduced by [MSSW2018], and we show the first lower bounds establishing that a linear dependence on μ is necessary.
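For intuition, the "one-shot" scheme the abstract refers to — compute the ℓ_p Lewis weights of the matrix once, then sample rows i.i.d. proportionally to those weights with the standard rescaling — can be sketched as follows. This is a minimal illustration, not the paper's algorithm; the fixed-point iteration for the weights is the one of Cohen and Peng, which converges for p < 4, and all function names here are hypothetical.

```python
import numpy as np

def lewis_weights(A, p, iters=50):
    """Approximate lp Lewis weights of A via the Cohen-Peng fixed-point
    iteration w_i <- (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2} (converges for p < 4).
    At the fixed point the weights sum to rank(A)."""
    n, d = A.shape
    w = np.ones(n)
    for _ in range(iters):
        # M = (A^T W^{1-2/p} A)^{-1}
        M = np.linalg.inv(A.T @ (w[:, None] ** (1 - 2 / p) * A))
        quad = np.einsum("ij,jk,ik->i", A, M, A)  # a_i^T M a_i for each row
        w = quad ** (p / 2)
    return w

def one_shot_sample(A, p, m, seed=0):
    """Sample m rows i.i.d. with probability proportional to their Lewis
    weights, rescaled so that E[||SAx||_p^p] = ||Ax||_p^p for every x."""
    n, d = A.shape
    w = lewis_weights(A, p)
    q = w / w.sum()
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=m, p=q)
    scale = (1.0 / (m * q[idx])) ** (1.0 / p)
    return A[idx] * scale[:, None]
```

The rescaling by (m·q_i)^{-1/p} makes the sampled matrix an unbiased estimator of ||Ax||_p^p; the paper's contribution is showing that Õ(d^{p/2}/ϵ²) such samples suffice for a (1+ϵ) subspace embedding when p > 2.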
