Sample-Efficient Linear Regression with Self-Selection Bias

22 February 2024
Jason Gaitonde
Elchanan Mossel
arXiv: 2402.14229 [abs | PDF | HTML]
Abstract

We consider the problem of linear regression with self-selection bias in the unknown-index setting, as introduced in recent work by Cherapanamjeri, Daskalakis, Ilyas, and Zampetakis [STOC 2023]. In this model, one observes $m$ i.i.d. samples $(\mathbf{x}_{\ell}, z_{\ell})_{\ell=1}^m$ where $z_{\ell} = \max_{i \in [k]}\{\mathbf{x}_{\ell}^T \mathbf{w}_i + \eta_{i,\ell}\}$, but the maximizing index $i_{\ell}$ is unobserved. Here, the $\mathbf{x}_{\ell}$ are assumed to be $\mathcal{N}(0, I_n)$ and the noise distribution $\boldsymbol{\eta}_{\ell} \sim \mathcal{D}$ is centered and independent of $\mathbf{x}_{\ell}$. We provide a novel and near-optimally sample-efficient (in terms of $k$) algorithm to recover $\mathbf{w}_1, \ldots, \mathbf{w}_k \in \mathbb{R}^n$ up to additive $\ell_2$-error $\varepsilon$ with polynomial sample complexity $\tilde{O}(n) \cdot \mathsf{poly}(k, 1/\varepsilon)$ and significantly improved time complexity $\mathsf{poly}(n, k, 1/\varepsilon) + O(\log(k)/\varepsilon)^{O(k)}$. When $k = O(1)$, our algorithm runs in $\mathsf{poly}(n, 1/\varepsilon)$ time, generalizing the polynomial guarantee of an explicit moment-matching algorithm of Cherapanamjeri et al. for $k = 2$ when it is known that $\mathcal{D} = \mathcal{N}(0, I_k)$. Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression, where the added noise is taken outside the maximum. For this problem, our algorithm is efficient in a much larger range of $k$ than the state of the art due to Ghosh, Pananjady, Guntuboyina, and Ramchandran [IEEE Trans. Inf. Theory 2022] for not too small $\varepsilon$, and leads to improved algorithms for any $\varepsilon$ by providing a warm start for existing local convergence methods.
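
For intuition, the two observation models described in the abstract are easy to simulate. The sketch below is an illustration of the data-generating processes only, not the paper's recovery algorithm; the Gaussian choice for $\mathcal{D}$, the noise scale, and the helper names `sample_self_selection` and `sample_max_linear` are assumptions made for the example.

```python
import numpy as np

def sample_self_selection(m, n, k, w, noise_std=1.0, rng=None):
    """Unknown-index self-selection model:
    z = max_i { x^T w_i + eta_i }, with the maximizing index unobserved.

    w is a (k, n) array holding the unknown regressors w_1, ..., w_k.
    Gaussian noise is used here purely for illustration; the paper allows
    much more general centered noise distributions D independent of x.
    """
    rng = np.random.default_rng(rng)
    x = rng.standard_normal((m, n))                 # covariates x_l ~ N(0, I_n)
    eta = noise_std * rng.standard_normal((m, k))   # noise eta_l, independent of x_l
    z = np.max(x @ w.T + eta, axis=1)               # only the max value is observed
    return x, z

def sample_max_linear(m, n, k, w, noise_std=1.0, rng=None):
    """Related max-linear regression model:
    z = max_i { x^T w_i } + eta, i.e. the noise is added outside the maximum."""
    rng = np.random.default_rng(rng)
    x = rng.standard_normal((m, n))
    eta = noise_std * rng.standard_normal(m)
    z = np.max(x @ w.T, axis=1) + eta
    return x, z

# Tiny usage example with k = 3 unknown regressors in R^n.
rng = np.random.default_rng(0)
n, k, m = 10, 3, 5000
w_true = rng.standard_normal((k, n))
x, z = sample_self_selection(m, n, k, w_true, rng=1)
print(x.shape, z.shape)  # (5000, 10) (5000,)
```

The only structural difference between the two samplers is where the noise enters relative to the maximum, which is why, as the abstract notes, an algorithm tolerating general noise inside the maximum also applies to the max-linear setting.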
