Sparse Dimensionality Reduction Revisited

The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of $n$ points in $\mathbb{R}^d$ into $m = O(\varepsilon^{-2} \lg n)$ dimensions while preserving all pairwise distances to within $1 \pm \varepsilon$. Each input point $x$ is embedded to $Ax$, where $A$ is an $m \times d$ matrix having $s$ non-zeros per column, allowing for an embedding time of $O(s \|x\|_0)$.

Since the sparsity of $A$ governs the embedding time, much work has gone into improving the sparsity $s$. The current state-of-the-art by Kane and Nelson (JACM'14) shows that $s = O(\varepsilon^{-1} \lg n)$ suffices. This is almost matched by a lower bound of $s = \Omega(\varepsilon^{-1} \lg n / \lg(1/\varepsilon))$ by Nelson and Nguyen (STOC'13). Previous work thus suggests that we have near-optimal embeddings.

In this work, we revisit sparse embeddings and identify a loophole in the lower bound. Concretely, it requires $d \geq n$, which in many applications is unrealistic. We exploit this loophole to give a sparser embedding when $d = o(n)$, achieving $s = O(\varepsilon^{-1}(\lg n / \lg(1/\varepsilon) + \lg^{2/3} n \, \lg^{1/3} d))$. We also complement our analysis by strengthening the lower bound of Nelson and Nguyen to hold also when $d \ll n$, thereby matching the first term in our new sparsity upper bound. Finally, we also improve the sparsity of the best oblivious subspace embeddings for optimal embedding dimensionality.
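To make the mechanism concrete, here is a minimal NumPy sketch of a generic sparse JL embedding: each column of $A$ receives $s$ random $\pm 1/\sqrt{s}$ entries in distinct random rows. This illustrates the general template described above, not the paper's specific construction, and the constants behind $m$ and $s$ are arbitrary placeholders.

```python
import numpy as np

def sparse_jl_matrix(m, d, s, rng):
    """Build an m x d embedding matrix with exactly s non-zeros per column.

    Each column places random signs, scaled by 1/sqrt(s), in s distinct
    rows, so that E[||Ax||^2] = ||x||^2 for every fixed x.
    """
    A = np.zeros((m, d))
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)  # s distinct target rows
        A[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return A

rng = np.random.default_rng(0)
n, d, eps = 100, 2000, 0.25
m = int(8 * np.log(n) / eps**2)    # m = O(eps^-2 lg n); the constant 8 is a placeholder
s = int(np.ceil(np.log(n) / eps))  # s = O(eps^-1 lg n), the Kane-Nelson sparsity

A = sparse_jl_matrix(m, d, s, rng)
X = rng.standard_normal((n, d))    # n input points in R^d
Y = X @ A.T                        # row i holds A @ X[i]

# Pairwise distances should be preserved to within roughly 1 +/- eps.
for i, j in [(0, 1), (10, 20), (30, 99)]:
    ratio = np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
    print(f"pair ({i:2d},{j:2d}): distortion ratio {ratio:.3f}")
```

A dense matrix is used here only for clarity; an actual implementation would store $A$ in a sparse column format (e.g. scipy.sparse.csc_matrix) so that computing $Ax$ costs $O(s \|x\|_0)$ rather than $O(md)$.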
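As a back-of-the-envelope reading of the sparsity bounds above, the sketch below compares the classical $\varepsilon^{-1} \lg n$ sparsity with the new bound in a hypothetical regime where $d \ll n$; all big-O constants are dropped and the parameter values are chosen purely for illustration. When $d$ is fixed and $n$ grows, the second term $\lg^{2/3} n \, \lg^{1/3} d$ is lower-order, so the new bound approaches $\varepsilon^{-1} \lg n / \lg(1/\varepsilon)$, i.e. the strengthened lower bound.

```python
import numpy as np

# Hypothetical regime with d much smaller than n; all big-O constants dropped.
n, d, eps = 2.0**60, 2.0**10, 0.1
lg_n, lg_d, lg_inv_eps = np.log2(n), np.log2(d), np.log2(1 / eps)

s_old = lg_n / eps                                             # Kane-Nelson: eps^-1 lg n
s_new = (lg_n / lg_inv_eps + lg_n**(2/3) * lg_d**(1/3)) / eps  # new upper bound

print(f"classical sparsity ~ {s_old:.0f}")  # ~600
print(f"new sparsity bound ~ {s_new:.0f}")  # ~511
```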