
Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization

Abstract

We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem, where the inputs are a matrix $\mathbf{A}\in\{0,1\}^{n\times d}$, a rank parameter $k>0$, and an accuracy parameter $\varepsilon>0$, and the goal is to approximate $\mathbf{A}$ as a product of low-rank factors $\mathbf{U}\in\{0,1\}^{n\times k}$ and $\mathbf{V}\in\{0,1\}^{k\times d}$. Equivalently, we want to find $\mathbf{U}$ and $\mathbf{V}$ that minimize the Frobenius loss $\|\mathbf{U}\mathbf{V} - \mathbf{A}\|_F^2$. Before this work, the state of the art for this problem was the approximation algorithm of Kumar et al. [ICML 2019], which achieves a $C$-approximation for some constant $C\ge 576$. We give the first $(1+\varepsilon)$-approximation algorithm with running time singly exponential in $k$, where $k$ is typically a small integer. Our techniques generalize to other common variants of the BMF problem, admitting bicriteria $(1+\varepsilon)$-approximation algorithms for $L_p$ loss functions and for the setting where matrix operations are performed in $\mathbb{F}_2$. Our approach can be implemented in standard big data models, such as the streaming or distributed models.
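
To make the objective concrete, the following is a minimal sketch of the BMF problem itself, not of the algorithm described in the abstract. It assumes ordinary integer arithmetic (not $\mathbb{F}_2$) and finds an exact minimizer by exhaustive search, which takes time exponential in $kd$ and is only feasible for tiny inputs. The function names `frobenius_loss` and `brute_force_bmf` are hypothetical and chosen for illustration.

```python
from itertools import product


def frobenius_loss(U, V, A):
    """Squared Frobenius loss ||UV - A||_F^2 using ordinary integer arithmetic."""
    n, k, d = len(U), len(V), len(V[0])
    loss = 0
    for i in range(n):
        for j in range(d):
            entry = sum(U[i][t] * V[t][j] for t in range(k))
            loss += (entry - A[i][j]) ** 2
    return loss


def brute_force_bmf(A, k):
    """Exhaustively search binary factors U (n x k) and V (k x d).

    Illustration only: enumerates all 2^(k*d) candidates for V. For a
    fixed V the rows of U decouple, so each row is chosen independently
    among its 2^k candidates.
    """
    d = len(A[0])
    best = None
    for flat in product((0, 1), repeat=k * d):
        V = [list(flat[t * d:(t + 1) * d]) for t in range(k)]
        U, total = [], 0
        for row in A:
            best_row, best_cost = None, None
            for u in product((0, 1), repeat=k):
                cost = sum(
                    (sum(u[t] * V[t][j] for t in range(k)) - row[j]) ** 2
                    for j in range(d)
                )
                if best_cost is None or cost < best_cost:
                    best_row, best_cost = list(u), cost
            U.append(best_row)
            total += best_cost
        if best is None or total < best[0]:
            best = (total, U, V)
    return best  # (loss, U, V)


# Toy input assumed for illustration: two overlapping-free blocks.
A = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
loss, U, V = brute_force_bmf(A, k=2)
print(loss, U, V)
```

On this toy matrix, rank $k=2$ admits an exact factorization (loss 0), while $k=1$ cannot do better than loss 1, since a single binary rank-one term can cover only one of the two blocks.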
