Empowering Elasticsearch with Exact and Fast r-Neighbor Search in Hamming Space

Abstract

Interest has grown recently in building nearest neighbor search solutions within Elasticsearch, one of the most popular full-text search engines. In this paper, we focus specifically on nearest neighbor search in Hamming space using Elasticsearch. By combining three techniques (bit operations, substring filtering, and data preprocessing with permutation), we develop a novel approach called FENSHSES (Fast Exact Neighbor Search in Hamming Space on Elasticsearch), which achieves dramatic speed-ups over the existing term-match baseline. This empowers Elasticsearch with fast information retrieval even when documents (e.g., texts, images, and sounds) are represented with binary codes, a common practice in modern semantic representation learning.
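To make the first two ideas concrete, the following is a minimal Python sketch of exact r-neighbor search in Hamming space. It is not the paper's Elasticsearch implementation. It assumes binary codes are stored as Python integers of `m` bits, with `m` divisible by the number of substrings `s`; the function names (`hamming`, `split_code`, `r_neighbors`) are hypothetical. The filter relies on the pigeonhole principle: if two codes differ in at most `r` bits, then at least one of their `s` aligned substrings differs in at most `floor(r / s)` bits, so codes failing that test can be skipped without losing any true neighbor. The permutation preprocessing step is not shown.

```python
from typing import List


def hamming(a: int, b: int) -> int:
    """Hamming distance via bit operations: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


def split_code(code: int, m: int, s: int) -> List[int]:
    """Split an m-bit code into s equal-width substrings (assumes m % s == 0)."""
    width = m // s
    mask = (1 << width) - 1
    return [(code >> (i * width)) & mask for i in range(s)]


def r_neighbors(query: int, codes: List[int], r: int, m: int, s: int) -> List[int]:
    """Return every code within Hamming distance r of the query (exact search)."""
    threshold = r // s  # pigeonhole bound on the closest substring
    query_parts = split_code(query, m, s)
    neighbors = []
    for code in codes:
        parts = split_code(code, m, s)
        # Filter: keep only candidates with at least one close substring.
        if any(hamming(q, c) <= threshold for q, c in zip(query_parts, parts)):
            # Verify: compute the exact distance on surviving candidates only.
            if hamming(query, code) <= r:
                neighbors.append(code)
    return neighbors


codes = [0b00000000, 0b00000011, 0b11110000, 0b11111111]
result = r_neighbors(0b00000000, codes, r=2, m=8, s=2)
```

In this toy loop the filter still visits every code; the benefit appears when the substrings are indexed (for example, in an inverted index), so that only candidates sharing a close substring are retrieved at all.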
