The Geometry of Generalized Binary Search
This paper investigates the problem of determining a binary-valued function through a sequence of strategically selected queries. The focus is Generalized Binary Search (GBS), a well-known greedy algorithm for this problem. At each step, GBS selects a query that most evenly splits the hypotheses under consideration into two disjoint subsets, a natural generalization of the idea underlying classic binary search and Shannon-Fano coding. GBS is used in many applications, including channel coding, experimental design, fault testing, machine diagnostics, disease diagnosis, job scheduling, image processing, computer vision, and machine learning. This paper develops novel incoherence and geometric conditions under which GBS achieves the information-theoretically optimal query complexity; i.e., given a collection of N hypotheses, GBS terminates with the correct function in O(log N) queries. Furthermore, a noise-tolerant version of GBS is developed that also achieves the optimal query complexity. These results are applied to learning multidimensional threshold functions, a problem arising routinely in image processing and machine learning.
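The greedy selection step described above can be illustrated with a minimal sketch of noiseless GBS. It is not taken from the paper: the function name `generalized_binary_search`, the representation of hypotheses as rows of +1/-1 values over a finite query set, and the `oracle` callback are all assumptions made for illustration. The sketch also assumes that the true function is one of the hypotheses and that responses are noise-free.

```python
from typing import Callable, List, Sequence, Tuple


def generalized_binary_search(
    hypotheses: Sequence[Sequence[int]],
    oracle: Callable[[int], int],
) -> Tuple[int, int]:
    """Noiseless GBS sketch (illustrative; names are hypothetical).

    hypotheses[i][x] is the +1/-1 value of hypothesis i at query x.
    oracle(x) returns the true function's +1/-1 value at query x.
    Returns (index of a surviving hypothesis, number of queries asked).
    """
    viable: List[int] = list(range(len(hypotheses)))
    num_queries = len(hypotheses[0])
    asked = 0

    def imbalance(x: int) -> int:
        # |sum of predictions| is 0 for a perfectly even split and
        # len(viable) when every viable hypothesis agrees at x.
        return abs(sum(hypotheses[i][x] for i in viable))

    while len(viable) > 1:
        # Greedy step: choose the query that splits the viable set most evenly.
        best = min(range(num_queries), key=imbalance)
        if imbalance(best) == len(viable):
            # No query separates the remaining hypotheses; stop.
            break
        response = oracle(best)
        asked += 1
        # Keep only hypotheses consistent with the observed response.
        viable = [i for i in viable if hypotheses[i][best] == response]

    return viable[0], asked


# Example: one-dimensional thresholds on the points 0..7, where
# hypothesis t labels x as +1 if x >= t and -1 otherwise.
points = range(8)
hypotheses = [[1 if x >= t else -1 for x in points] for t in range(8)]
true_index = 5
result = generalized_binary_search(
    hypotheses, lambda x: hypotheses[true_index][x]
)
print(result)  # (5, 3)
```

In this threshold example the procedure reduces to classic binary search, so the N = 8 hypotheses are resolved in log2(8) = 3 queries. For general hypothesis classes an even split may not exist at every step, which is where conditions such as those developed in the paper become relevant.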