RACE: Sub-Linear Memory Sketches for Approximate Near-Neighbor Search on Streaming Data
We present the first sublinear-memory sketch that can be queried to find the nearest neighbors in a dataset. Our online sketching algorithm compresses a dataset into a sketch whose size is sublinear in the number of elements, provided the query satisfies a data-dependent near-neighbor stability condition. We achieve data-dependent sublinear space by combining recent advances in locality-sensitive hashing (LSH)-based estimators with compressed sensing. Our results shed new light on the memory-accuracy tradeoff for near-neighbor search. The techniques presented reveal a deep connection between the fundamental compressed sensing (or heavy hitters) recovery problem and near-neighbor search, leading to new insight for geometric search problems and implications for sketching algorithms.
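To make the idea of an LSH-based estimator concrete, the following is a minimal sketch of a streaming counter array indexed by LSH codes. It is an illustration under stated assumptions, not the paper's algorithm: the class name `LshCountSketch`, the choice of signed random projections as the hash family, and the parameters `rows` and `bits` are all hypothetical, and the compressed sensing step that identifies which elements are the neighbors is omitted entirely.

```python
import random


class LshCountSketch:
    """Hypothetical illustration: rows of counters indexed by LSH codes.

    Assumes signed random projections (sign of a dot product with a random
    Gaussian vector) as the LSH family. Memory is rows * 2**bits counters,
    independent of how many points are streamed in.
    """

    def __init__(self, dim, rows=50, bits=4, seed=0):
        rng = random.Random(seed)
        # One independent set of `bits` random hyperplanes per row.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]
            for _ in range(rows)
        ]
        # Each row holds one counter per possible hash code.
        self.counts = [[0] * (1 << bits) for _ in range(rows)]

    def _code(self, row, x):
        # Concatenate the sign bits of the projections into an integer code.
        code = 0
        for plane in self.planes[row]:
            dot = sum(p * v for p, v in zip(plane, x))
            code = (code << 1) | (1 if dot >= 0 else 0)
        return code

    def add(self, x):
        # Streaming update: increment one counter in every row.
        for r in range(len(self.counts)):
            self.counts[r][self._code(r, x)] += 1

    def query(self, q):
        # Average, over rows, of the counter that the query hashes to.
        # Points that collide with q under the LSH contribute to this value.
        total = sum(
            self.counts[r][self._code(r, q)] for r in range(len(self.counts))
        )
        return total / len(self.counts)


sketch = LshCountSketch(dim=2)
for _ in range(5):
    sketch.add([1.0, 0.2])

near = sketch.query([1.0, 0.2])    # same direction as the stored points
far = sketch.query([-1.0, -0.2])   # opposite direction
```

In this toy example the query that matches the streamed points collides with them in every row, while the antipodal query flips every sign bit and collides with none, so `near` exceeds `far`. The averaged count behaves like a similarity-weighted estimate of how many stored points lie close to the query; recovering the identities of those points from such counts is where the abstract's compressed sensing (heavy hitters) connection comes in.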