Sketching has emerged as a powerful technique for speeding up problems in numerical linear algebra, such as regression. In the overconstrained regression problem, one is given an $n \times d$ matrix $A$, with $n \gg d$, as well as an $n \times 1$ vector $b$, and one wants to find a vector $x^*$ so as to minimize the residual error $\|Ax - b\|_2$. Using the sketch-and-solve paradigm, one first computes $S \cdot A$ and $S \cdot b$ for a randomly chosen matrix $S$, then outputs $x'$ so as to minimize $\|SAx' - Sb\|_2$.

The sketch-and-solve paradigm gives a bound on $\|x' - x^*\|_2$ when $A$ is well-conditioned. Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a$, we have with probability $1 - d^{-c}$ that \[ \langle a, x'-x^*\rangle \lesssim \frac{\|a\|_2\|x'-x^*\|_2}{d^{\frac{1}{2}-\gamma}}, \quad (1) \] where $c, \gamma > 0$ are arbitrary constants. This implies $\|x' - x^*\|_\infty$ is a factor $d^{\frac{1}{2}-\gamma}$ smaller than $\|x' - x^*\|_2$. It also gives a better bound on the generalization of $x'$ to new examples: if rows of $A$ correspond to examples and columns to features, then our result gives a better bound for the error introduced by sketch-and-solve when classifying fresh examples.

We show that not all oblivious subspace embeddings $S$ satisfy these properties. In particular, we give counterexamples showing that matrices based on Count-Sketch or leverage score sampling do not satisfy these properties. We also provide lower bounds, both on how small $\|x' - x^*\|_2$ can be and for our new guarantee (1), showing that the subsampled randomized Fourier/Hadamard transform is nearly optimal.
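To make the paradigm concrete, here is a minimal numpy sketch of sketch-and-solve with a subsampled randomized Hadamard transform, i.e. $S = \sqrt{n/m}\, P H D$ with random signs $D$, the normalized Hadamard transform $H$, and a uniform row subsample $P$. This is an illustration under stated assumptions, not the paper's implementation: the helper names (`fwht`, `srht_sketch_and_solve`), the sketch size $m = 16d$, and the synthetic data are all choices made here.

```python
import numpy as np

def fwht(x):
    """In-place fast Walsh-Hadamard transform along axis 0 (n must be a power of 2)."""
    n = x.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            top = x[i:i + h].copy()
            bot = x[i + h:i + 2 * h].copy()
            x[i:i + h] = top + bot
            x[i + h:i + 2 * h] = top - bot
        h *= 2
    return x / np.sqrt(n)  # normalize so the transform is orthonormal

def srht_sketch_and_solve(A, b, m, rng):
    """Return x' minimizing ||S A x' - S b||_2 for an SRHT sketch S with m rows."""
    n, d = A.shape
    assert (n & (n - 1)) == 0, "Hadamard transform needs n to be a power of 2"
    signs = rng.choice([-1.0, 1.0], size=n)        # D: random diagonal signs
    M = np.concatenate([A, b[:, None]], axis=1)    # transform [A | b] in one pass
    M = fwht(signs[:, None] * M)                   # H D [A | b]
    rows = rng.choice(n, size=m, replace=False)    # P: uniform row subsample
    SM = np.sqrt(n / m) * M[rows]                  # rescale so norms are preserved in expectation
    SA, Sb = SM[:, :d], SM[:, d]
    x_prime, *_ = np.linalg.lstsq(SA, Sb, rcond=None)  # solve the sketched regression
    return x_prime

# Illustrative usage on synthetic data (all parameters are assumptions).
rng = np.random.default_rng(0)
n, d = 2**12, 50
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
x_prime = srht_sketch_and_solve(A, b, m=16 * d, rng=rng)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)     # exact solution, for comparison
print(np.linalg.norm(x_prime - x_star), np.abs(x_prime - x_star).max())
```

Stacking $b$ as an extra column of $A$ is just a convenience so that one Hadamard pass sketches both; the final line compares $\|x' - x^*\|_2$ against $\|x' - x^*\|_\infty$, the quantity the main result says should be roughly a $\sqrt{d}$ factor smaller.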