arXiv:1705.10723

Fast Regression with an $\ell_\infty$ Guarantee

30 May 2017
Eric Price
Zhao Song
David P. Woodruff
Abstract

Sketching has emerged as a powerful technique for speeding up problems in numerical linear algebra, such as regression. In the overconstrained regression problem, one is given an $n \times d$ matrix $A$, with $n \gg d$, as well as an $n \times 1$ vector $b$, and one wants to find a vector $\hat{x}$ so as to minimize the residual error $\|Ax-b\|_2$. Using the sketch-and-solve paradigm, one first computes $S \cdot A$ and $S \cdot b$ for a randomly chosen matrix $S$, then outputs $x' = (SA)^{\dagger} Sb$ so as to minimize $\|SAx' - Sb\|_2$. The sketch-and-solve paradigm gives a bound on $\|x'-x^*\|_2$ when $A$ is well-conditioned. Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a \in \mathbb{R}^d$, we have with $1 - d^{-c}$ probability that
\[ \langle a, x'-x^*\rangle \lesssim \frac{\|a\|_2\|x'-x^*\|_2}{d^{\frac{1}{2}-\gamma}}, \quad (1) \]
where $c, \gamma > 0$ are arbitrary constants. This implies $\|x'-x^*\|_{\infty}$ is a factor $d^{\frac{1}{2}-\gamma}$ smaller than $\|x'-x^*\|_2$. It also gives a better bound on the generalization of $x'$ to new examples: if rows of $A$ correspond to examples and columns to features, then our result gives a better bound for the error introduced by sketch-and-solve when classifying fresh examples. We show that not all oblivious subspace embeddings $S$ satisfy these properties. In particular, we give counterexamples showing that matrices based on Count-Sketch or leverage score sampling do not satisfy these properties. We also provide lower bounds, both on how small $\|x'-x^*\|_2$ can be, and for our new guarantee (1), showing that the subsampled randomized Fourier/Hadamard transform is nearly optimal.
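
The sketch-and-solve step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. It assumes that $n$ is a power of two, that $S$ is a subsampled randomized Hadamard transform built from random signs, a Walsh-Hadamard transform, and uniform row sampling without replacement, and that $x^*$ denotes the exact least-squares solution. The names `fwht` and `srht_sketch_and_solve`, and the sizes in the demo, are hypothetical choices made for illustration.

```python
import numpy as np


def fwht(X):
    """Unnormalized fast Walsh-Hadamard transform along axis 0.

    X must be 2-D with a power-of-two number of rows (an assumption
    of this sketch); the input is copied, not modified.
    """
    X = np.array(X, dtype=float)
    n = X.shape[0]
    if n < 1 or n & (n - 1):
        raise ValueError("number of rows must be a power of two")
    h = 1
    while h < n:
        # Butterfly: combine the two halves of every block of 2*h rows.
        X = X.reshape(n // (2 * h), 2, h, -1)
        top = X[:, 0] + X[:, 1]
        bot = X[:, 0] - X[:, 1]
        X = np.stack([top, bot], axis=1).reshape(n, -1)
        h *= 2
    return X


def srht_sketch_and_solve(A, b, m, rng):
    """Return x' = argmin ||S A x - S b||_2 for an m-row SRHT sketch S."""
    n, d = A.shape
    signs = rng.choice([-1.0, 1.0], size=n)        # random diagonal D
    rows = rng.choice(n, size=m, replace=False)    # uniform row sample P
    M = np.column_stack([A, b])                    # sketch A and b together
    # S = sqrt(n/m) * P * (H / sqrt(n)) * D, which simplifies to P H D / sqrt(m).
    SM = fwht(signs[:, None] * M)[rows] / np.sqrt(m)
    SA, Sb = SM[:, :d], SM[:, d]
    x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x_sketch


# Demo with arbitrary illustrative sizes: compare x' against the exact x*.
rng = np.random.default_rng(0)
n, d, m = 1024, 16, 256
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + rng.standard_normal(n)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
x_prime = srht_sketch_and_solve(A, b, m, rng)
err = x_prime - x_star
print("l2 error:  ", np.linalg.norm(err))
print("linf error:", np.linalg.norm(err, ord=np.inf))
```

The two printed norms correspond to the quantities $\|x'-x^*\|_2$ and $\|x'-x^*\|_\infty$ compared in the abstract. A single random run only illustrates the setup and does not verify guarantee (1).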
