
Sign Stable Random Projections for Large-Scale Learning

Abstract

We study the use of "sign α-stable random projections" (where 0 < α ≤ 2) for building basic data processing tools in the context of large-scale machine learning applications (e.g., classification, regression, clustering, and near-neighbor search). After the processing by sign stable random projections, the inner products of the processed data approximate various types of nonlinear kernels depending on the value of α. Thus, this approach provides an effective strategy for approximating nonlinear learning algorithms essentially at the cost of linear learning. When α = 2, it is known that the corresponding nonlinear kernel is the arc-cosine kernel. When α = 1, the procedure approximates the arc-cos-χ² kernel (under a certain condition). When α → 0+, it corresponds to the resemblance kernel. From a practitioner's perspective, the method of sign α-stable random projections is ready to be tested in large-scale learning applications, where α can simply be viewed as a tuning parameter. What is missing in the literature is an extensive empirical study demonstrating the effectiveness of sign stable random projections, especially for α ≠ 2 or 1. This paper supplies such a study on a wide variety of classification datasets. In particular, we compare sign stable random projections side by side with the recently proposed "0-bit consistent weighted sampling (CWS)" (Li 2015).
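For concreteness, here is a minimal sketch of the projection step described above, assuming NumPy and SciPy (whose levy_stable distribution samples symmetric α-stable variables). The function name and parameters are illustrative choices, not an implementation from the paper:

```python
import numpy as np
from scipy.stats import levy_stable

def sign_stable_projections(X, k, alpha, seed=0):
    """Sign alpha-stable random projections (illustrative sketch).

    X     : (n, d) data matrix. For alpha < 2, the theory typically
            assumes nonnegative entries (e.g., the arc-cos-chi^2
            kernel at alpha = 1).
    k     : number of random projections.
    alpha : stability index, 0 < alpha <= 2 (alpha = 2 is Gaussian).

    Returns an (n, k) matrix of signs in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    # Projection matrix with i.i.d. symmetric alpha-stable entries
    # (beta = 0 gives the symmetric case).
    R = levy_stable.rvs(alpha, beta=0, size=(X.shape[1], k),
                        random_state=rng)
    return np.sign(X @ R)

# Usage: the fraction of matching signs between two points estimates
# the collision probability, which is monotone in the underlying
# nonlinear kernel, so linear methods on the signs approximate
# nonlinear learning.
X = np.abs(np.random.randn(100, 50))      # nonnegative toy data
Z = sign_stable_projections(X, k=1024, alpha=1.0)
match = 0.5 * (Z @ Z.T) / Z.shape[1] + 0.5  # sign-agreement fractions
```

The signs can be fed directly to a linear classifier, with α tuned on held-out data as the abstract suggests.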
