Fast Regression with an $\ell_\infty$ Guarantee

Abstract

Sketching has emerged as a powerful technique for speeding up problems in numerical linear algebra, such as regression. In the overconstrained regression problem, one is given an $n \times d$ matrix $A$, with $n \gg d$, as well as an $n \times 1$ vector $b$, and one wants to find a vector $\hat{x}$ so as to minimize the residual error $\|Ax - b\|_2$. Using the sketch-and-solve paradigm, one first computes $S \cdot A$ and $S \cdot b$ for a randomly chosen matrix $S$, then outputs $x' = (SA)^{\dagger} Sb$ so as to minimize $\|SAx' - Sb\|_2$. The sketch-and-solve paradigm gives a bound on $\|x' - x^*\|_2$ when $A$ is well-conditioned. Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a \in \mathbb{R}^d$, we have with $1 - d^{-c}$ probability that
\[
\langle a, x' - x^* \rangle \lesssim \frac{\|a\|_2 \|x' - x^*\|_2}{d^{\frac{1}{2} - \gamma}}, \quad (1)
\]
where $c, \gamma > 0$ are arbitrary constants. This implies that $\|x' - x^*\|_{\infty}$ is a factor $d^{\frac{1}{2} - \gamma}$ smaller than $\|x' - x^*\|_2$. It also gives a better bound on the generalization of $x'$ to new examples: if the rows of $A$ correspond to examples and the columns to features, then our result gives a better bound on the error introduced by sketch-and-solve when classifying fresh examples. We show that not all oblivious subspace embeddings $S$ satisfy these properties. In particular, we give counterexamples showing that matrices based on Count-Sketch or leverage score sampling do not. We also provide lower bounds, both on how small $\|x' - x^*\|_2$ can be and for our new guarantee (1), showing that the subsampled randomized Fourier/Hadamard transform is nearly optimal.
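As a concrete illustration of the sketch-and-solve paradigm with a subsampled randomized Hadamard transform, the following NumPy sketch computes $x' = (SA)^{\dagger} Sb$ and compares it to the exact least-squares solution $x^*$. The problem dimensions, noise level, and sketch size $m$ are illustrative choices, not values from the paper.

```python
import numpy as np

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform along axis 0 (length must be a power of 2)."""
    x = x.copy().astype(float)
    n = x.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            c = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + c
            x[i + h:i + 2 * h] = a - c
        h *= 2
    return x / np.sqrt(n)

rng = np.random.default_rng(0)
n, d, m = 1024, 20, 200            # n >> d; m is the sketch size (illustrative)

A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# SRHT sketch S = sqrt(n/m) * P H D: random diagonal signs D, a Hadamard
# transform H, then a uniform subsample P keeping m of the n rows.
signs = rng.choice([-1.0, 1.0], size=n)
rows = rng.choice(n, size=m, replace=False)
SA = np.sqrt(n / m) * fwht(signs[:, None] * A)[rows]
Sb = np.sqrt(n / m) * fwht(signs * b)[rows]

x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)   # sketch-and-solve solution x'
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)        # exact least-squares solution x*
```

The standard subspace-embedding guarantee then says the residual $\|A x' - b\|_2$ is within a $(1 + \varepsilon)$ factor of the optimum $\|A x^* - b\|_2$, even though the sketched system is only $m \times d$.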
