Near Memory Similarity Search on Automata Processors

9 August 2016

Vincent T. Lee

Abstract

Embedded devices and multimedia applications today generate unprecedented volumes of data, which must be indexed and made searchable. As a result, similarity search has become a critical idiom for many modern data-intensive applications in natural language processing (NLP), vision, and robotics. At its core, similarity search is implemented using k-nearest neighbors (kNN), where computation consists of highly parallel distance calculations and a global top-k sort. In contemporary von Neumann architectures, kNN is bottlenecked by data movement, which limits throughput and latency. Near-data processing has been proposed as a solution to reduce data movement for these memory-bound applications, improving run time and energy efficiency. In this paper, we codesign and evaluate kNN on the Micron Automata Processor (AP), a non-von Neumann architecture for near-data processing designed for automata evaluation. We present a novel nondeterministic finite automaton design and show how, using temporal encodings, it can be used to evaluate kNN. We evaluate our design's performance on the AP against state-of-the-art CPU, GPU, and FPGA implementations and show that current-generation AP hardware can achieve a 52.6x speedup over multicore processors while maintaining energy efficiency gains competitive with those of heterogeneous computing substrates. Finally, we propose several automata optimization techniques and simple architectural extensions, evaluate their potential impact, and show how they can achieve an additional 73.6x performance improvement on next-generation AP hardware.
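
As a rough illustration of the two kNN phases named above (parallel distance calculations followed by a global top-k sort), the following is a minimal brute-force sketch in Python. It is not the automaton design from this paper. The function names `squared_euclidean` and `knn` are hypothetical, and the distance metric is an assumption, since the abstract does not name one.

```python
import heapq
from typing import List, Sequence, Tuple


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (assumed metric; the abstract names none)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(query: Sequence[float],
        dataset: Sequence[Sequence[float]],
        k: int) -> List[Tuple[float, int]]:
    """Return the k nearest (distance, index) pairs, nearest first."""
    # Phase 1: distance calculations. Each one depends only on the query
    # and a single dataset point, so they are independent of one another.
    distances = [(squared_euclidean(query, point), index)
                 for index, point in enumerate(dataset)]
    # Phase 2: global top-k selection over all computed distances.
    return heapq.nsmallest(k, distances)


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
    print(knn((0.0, 0.0), points, k=2))
```

Every dataset point must be read once per query in phase 1, which is the data movement that the abstract identifies as the bottleneck on von Neumann machines.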
