Guessing Efficiently for Constrained Subspace Approximation
In this paper we study the constrained subspace approximation problem. Given a set of $n$ points in $\mathbb{R}^d$, the goal of the {\em subspace approximation} problem is to find a $k$-dimensional subspace that best approximates the input points. More precisely, for a given $p \geq 1$, we aim to minimize the $p$th power of the $\ell_p$ norm of the error vector $(\|a_1 - \mathbf{P}a_1\|, \ldots, \|a_n - \mathbf{P}a_n\|)$, where $\mathbf{P}$ denotes the projection matrix onto the subspace and the norms are Euclidean. In \emph{constrained} subspace approximation (CSA), we additionally have constraints on the projection matrix $\mathbf{P}$. In its most general form, we require $\mathbf{P}$ to belong to a given subset $\mathcal{S}$ that is described explicitly or implicitly. We introduce a general framework for constrained subspace approximation. Our approach, which we term {\em coreset-guess-solve}, yields either $(1+\varepsilon)$-multiplicative or $\varepsilon$-additive approximations for a variety of constraints. We show that it provides new algorithms for partition-constrained subspace approximation with applications to {\it fair} subspace approximation, $k$-means clustering, and projected non-negative matrix factorization, among others. Specifically, while we recover the best known bounds for $k$-means clustering in Euclidean spaces, we improve the known results for the remainder of the problems.
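The objective described above can be sketched in a few lines of NumPy. This is a minimal illustration of the cost being minimized, not code from the paper; the function name and the choice of representing the subspace by an orthonormal basis matrix are assumptions for the example.

```python
import numpy as np

def subspace_approx_cost(X, U, p=2):
    """Illustrative subspace approximation cost (not from the paper).

    X: (n, d) array, one input point per row.
    U: (d, k) orthonormal basis of a candidate k-dimensional subspace.
    Returns sum_i ||x_i - P x_i||_2^p with P = U U^T the projection matrix.
    """
    P = U @ U.T                       # projection matrix onto span(U)
    residuals = X - X @ P             # per-point error vectors x_i - P x_i
    return float(np.sum(np.linalg.norm(residuals, axis=1) ** p))

# Three points in R^2, projected onto the x-axis (k = 1):
X = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
U = np.array([[1.0], [0.0]])
print(subspace_approx_cost(X, U))     # only the third point has a residual
```

For $p = 2$ and no constraints, the optimal subspace is the top-$k$ right singular subspace of the data matrix (classical PCA); it is the constraint set on $\mathbf{P}$ that makes the problem hard in general.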