We study the use of "sign $\alpha$-stable random projections" (where $0 < \alpha \leq 2$) for building basic data processing tools in the context of large-scale machine learning applications (e.g., classification, regression, clustering, and near-neighbor search). After the processing by sign stable random projections, the inner products of the processed data approximate various types of nonlinear kernels depending on the value of $\alpha$. Thus, this approach provides an effective strategy for approximating nonlinear learning algorithms essentially at the cost of linear learning. When $\alpha = 2$, it is known that the corresponding nonlinear kernel is the arc-cosine kernel. When $\alpha = 1$, the procedure approximates the arc-cos-$\chi^2$ kernel (under a certain condition). When $\alpha \rightarrow 0+$, it corresponds to the resemblance kernel. From a practitioner's perspective, the method of sign $\alpha$-stable random projections is ready to be tested for large-scale learning applications, where $\alpha$ can simply be viewed as a tuning parameter. What is missing in the literature is an extensive empirical study demonstrating the effectiveness of sign stable random projections, especially for $\alpha \neq 2$ or 1. This paper supplies such a study on a wide variety of classification datasets. In particular, we compare sign stable random projections side-by-side with the recently proposed "0-bit consistent weighted sampling (CWS)" (Li 2015).
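To make the pipeline concrete, here is a minimal NumPy sketch (not the paper's code): it samples a symmetric $\alpha$-stable projection matrix via the standard Chambers-Mallows-Stuck recipe, projects the data, and keeps only the signs; the fraction of agreeing signs between two sign codes then estimates the collision probability, which is monotone in the corresponding nonlinear kernel. The function names (`stable_random_matrix`, `sign_stable_projections`) and all parameter choices below are illustrative assumptions, not from the paper.

```python
import numpy as np

def stable_random_matrix(alpha, d, k, rng):
    """Sample a d x k matrix of i.i.d. symmetric alpha-stable variates
    via the Chambers-Mallows-Stuck method (Gaussian up to scale at
    alpha = 2; standard Cauchy at alpha = 1)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size=(d, k))
    w = rng.exponential(1.0, size=(d, k))
    if alpha == 1.0:
        return np.tan(u)  # Cauchy special case
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))

def sign_stable_projections(X, alpha, k, seed=0):
    """Project the (nonnegative) data with an alpha-stable random matrix
    and keep only the signs of the k projected values."""
    rng = np.random.default_rng(seed)
    R = stable_random_matrix(alpha, X.shape[1], k, rng)
    return np.sign(X @ R)

# Toy usage: the fraction of agreeing signs between two sign codes
# estimates the collision probability, which is monotone in the
# underlying nonlinear kernel (e.g., the arc-cosine kernel at alpha = 2).
X = np.abs(np.random.default_rng(1).standard_normal((2, 100)))
codes = sign_stable_projections(X, alpha=1.0, k=4096)
print("sign agreement:", np.mean(codes[0] == codes[1]))
```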