Revisiting Sorted Table Search Procedures with Machine Learning: Methodological Insights via an Experimental Study

20 July 2020

Abstract

Learned Data Structures, a recent research area at the intersection of Machine Learning and Data Structures, use learning from data to improve the time/space performance of classic data structures, e.g., Search Trees, Hash Tables and Bloom Filters. It is also claimed that the use of GPUs for the learning phase helps make those approaches widespread. Yet the role of GPU computing with respect to Learned Data Structures is unclear. Moreover, no such study has been devoted to algorithms for Sorted Table Search, which are both fundamental and still widely used in several application domains. We provide such a study here, via a systematic experimental comparison of known efficient implementations of Sorted Table Search procedures and their Learned counterparts developed here, obtaining valuable methodological insights into the use of the latter. Specifically, we characterize the scenarios in which the Learned procedures can be profitably used in place of the classic ones, accounting for both CPU and GPU computing. We also formalize an Algorithmic Paradigm of Learned Dichotomic Sorted Table Search procedures that naturally complements the Learned one and that characterizes most of the known Sorted Table Search procedures as having a "learning phase" that approximates Simple Linear Regression.
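To make the idea of a Learned Sorted Table Search procedure concrete, the following is a minimal Python sketch, not the implementation evaluated in the paper. It assumes a non-empty sorted table of numeric keys, fits a Simple Linear Regression of position on key, records the largest prediction error over the table, and then restricts a standard binary search to the window around the predicted position. The names `fit_linear_model` and `learned_search` are hypothetical and chosen only for illustration.

```python
import bisect


def fit_linear_model(table):
    """Least-squares fit of position ~ slope * key + intercept.

    Assumes `table` is a non-empty sorted list of numbers (illustrative only).
    Returns (slope, intercept, eps), where eps bounds the prediction error
    on every key stored in the table.
    """
    n = len(table)
    mean_key = sum(table) / n
    mean_pos = (n - 1) / 2
    var_key = sum((k - mean_key) ** 2 for k in table)
    if var_key == 0:
        slope = 0.0  # all keys equal: fall back to a constant prediction
    else:
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(table))
        slope = cov / var_key
    intercept = mean_pos - slope * mean_key
    # Largest absolute error over the table, padded to absorb truncation.
    max_err = max(abs(i - (slope * k + intercept)) for i, k in enumerate(table))
    return slope, intercept, int(max_err) + 1


def learned_search(table, model, key):
    """Return an index of `key` in `table`, or -1 if it is absent."""
    slope, intercept, eps = model
    n = len(table)
    pred = int(slope * key + intercept)
    # Clamp the search window [pred - eps, pred + eps] to the table bounds.
    lo = min(max(0, pred - eps), n)
    hi = max(lo, min(n, pred + eps + 1))
    i = bisect.bisect_left(table, key, lo, hi)
    if i < n and table[i] == key:
        return i
    return -1


table = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
model = fit_linear_model(table)
found = [learned_search(table, model, k) for k in table]
```

In this sketch the "learning phase" is the regression fit, and the final step is an ordinary binary search over a window whose width depends on how well a straight line approximates the key distribution. Whether this beats a plain binary search in practice depends on the data and hardware, which is the kind of question the experimental study addresses.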
