ResearchTrend.AI

arXiv:1306.3690

Do Semidefinite Relaxations Solve Sparse PCA up to the Information Limit?

16 June 2013
Robert Krauthgamer
B. Nadler
Dan Vilenchik
Abstract

Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms have been developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is under what conditions such algorithms can recover the sparse principal components. We study this question for a single-spike model with an $\ell_0$-sparse eigenvector, in the asymptotic regime as the dimension $p$ and sample size $n$ both tend to infinity. Amini and Wainwright (2009) proved that for sparsity levels $k \geq \Omega(n/\log p)$, no algorithm, efficient or not, can reliably recover the sparse eigenvector. In contrast, for $k \leq O(\sqrt{n/\log p})$, diagonal thresholding is consistent. It was further conjectured that an SDP approach may close this gap between the computational and information limits. We prove that when $k \geq \Omega(\sqrt{n})$ the proposed SDP approach, at least in its standard usage, cannot recover the sparse spike. In fact, we conjecture that in the single-spike model, no computationally efficient algorithm can recover a spike of $\ell_0$-sparsity $k \geq \Omega(\sqrt{n})$. Finally, we present empirical results suggesting that up to sparsity levels $k = O(\sqrt{n})$, recovery is possible by a simple covariance thresholding algorithm.
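As a minimal illustration of the single-spike model and the diagonal thresholding estimator mentioned in the abstract, the sketch below draws samples from a spiked covariance $I + \beta v v^{\top}$ with a $k$-sparse spike $v$ and estimates $v$ by keeping the $k$ coordinates of largest sample variance. The function names, spike strength, and sample sizes are illustrative assumptions, not the paper's exact experimental setup:

```python
import numpy as np

def sample_spiked_data(n, p, k, beta=4.0, seed=0):
    """Draw n samples from N(0, I + beta * v v^T) with a k-sparse unit spike v.
    (Illustrative parameters; not the paper's experimental setup.)"""
    rng = np.random.default_rng(seed)
    v = np.zeros(p)
    v[:k] = 1.0 / np.sqrt(k)                  # k-sparse spike, ||v|| = 1
    # X = Z + sqrt(beta) * g v^T has population covariance I + beta * v v^T
    Z = rng.standard_normal((n, p))
    g = rng.standard_normal(n)
    return Z + np.sqrt(beta) * np.outer(g, v), v

def diagonal_thresholding(X, k):
    """Keep the k coordinates with largest sample variance, then take the
    leading eigenvector of the sample covariance restricted to them."""
    n, p = X.shape
    variances = (X ** 2).mean(axis=0)         # diagonal of sample covariance
    support = np.argsort(variances)[-k:]      # estimated support of the spike
    S = X[:, support].T @ X[:, support] / n   # restricted sample covariance
    w = np.linalg.eigh(S)[1][:, -1]           # leading eigenvector (eigh: ascending)
    v_hat = np.zeros(p)
    v_hat[support] = w
    return v_hat

# Regime where diagonal thresholding is consistent: k well below sqrt(n / log p)
n, p, k = 400, 200, 5
X, v = sample_spiked_data(n, p, k)
v_hat = diagonal_thresholding(X, k)
overlap = abs(v @ v_hat)                      # close to 1 when recovery succeeds
print(overlap)
```

With $k = 5$, $n = 400$, $p = 200$ the sparsity is well inside the $k \leq O(\sqrt{n/\log p})$ regime, so the printed overlap is close to 1; pushing $k$ toward $\sqrt{n}$ and beyond is where, per the abstract, diagonal thresholding fails and covariance thresholding is conjectured to be the limit of efficient recovery.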
