Optimal Sketching Bounds for Sparse Linear Regression

5 April 2023 · arXiv:2304.02261

Tung Mai, Alexander Munteanu, Cameron Musco, Anup B. Rao, Chris Schwiegelshohn, David P. Woodruff
Abstract

We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions, which includes the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. This establishes a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For this problem, under the $\ell_2$ norm, we observe an upper bound of $O(k\log(d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows, showing that $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows suffice, where $\mu$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this dimension is tight, up to lower order terms and the dependence on $\mu$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2 + \lambda\|x\|_1$ over $x \in \mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(\lambda\varepsilon)^2)$ suffices and that the dependence on $d$ and $\lambda$ is tight.
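The paper is theoretical, but the pipeline it analyzes — draw a sketching matrix $S$ obliviously (independently of the data), then solve the same $k$-sparse regression on the compressed pair $(SA, Sb)$ — is easy to illustrate. Below is a minimal NumPy sketch of that pipeline for the $\ell_2$ case. The plain Gaussian sketch and the brute-force support enumeration are stand-ins chosen for exposition, not the paper's constructions, and all dimensions are arbitrary toy values.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: n >> d, with a k-sparse ground-truth coefficient vector.
n, d, k = 2000, 20, 2
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
support = rng.choice(d, size=k, replace=False)
x_true[support] = rng.standard_normal(k)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def sparse_lstsq(M, y, k):
    """Exact k-sparse least squares by enumerating all size-k supports
    (feasible only for tiny d; the paper's results concern sketch size,
    not this solver)."""
    d = M.shape[1]
    best_cost, best_x = np.inf, None
    for supp in itertools.combinations(range(d), k):
        idx = list(supp)
        coef, *_ = np.linalg.lstsq(M[:, idx], y, rcond=None)
        cost = np.linalg.norm(M[:, idx] @ coef - y) ** 2
        if cost < best_cost:
            x = np.zeros(d)
            x[idx] = coef
            best_cost, best_x = cost, x
    return best_x

# Oblivious sketch: S is drawn without looking at A or b. m is chosen
# in the spirit of k log(d/k) / eps^2; the constant here is arbitrary.
m = 200
S = rng.standard_normal((m, n)) / np.sqrt(m)

x_full = sparse_lstsq(A, b, k)        # solve on the original data
x_sk = sparse_lstsq(S @ A, S @ b, k)  # solve on the m-row sketch

# Compare costs on the ORIGINAL data: the sketched solution should be
# an approximate minimizer of the original objective.
cost_full = np.linalg.norm(A @ x_full - b) ** 2
cost_sk = np.linalg.norm(A @ x_sk - b) ** 2
print(f"cost ratio (sketched / exact): {cost_sk / cost_full:.4f}")
```

With $m = 200$ rows standing in for the $\Theta(k\log(d/k)/\varepsilon^2)$ bound, the sketched solve works on a $200 \times 20$ problem instead of a $2000 \times 20$ one, while (with high probability over the draw of $S$) the cost of every $k$-sparse candidate is preserved up to a $1 \pm \varepsilon$ factor.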
