
Random Separating Hyperplane Theorem and Learning Polytopes

International Colloquium on Automata, Languages and Programming (ICALP), 2023
Abstract

The Separating Hyperplane theorem is a fundamental result in Convex Geometry with myriad applications. Our first result, Random Separating Hyperplane Theorem (RSH), is a strengthening of this for polytopes. RSH asserts that if the distance between $a$ and a polytope $K$ with $k$ vertices and unit diameter in $\Re^d$ is at least $\delta$, where $\delta$ is a fixed constant in $(0,1)$, then a randomly chosen hyperplane separates $a$ and $K$ with probability at least $1/\mathrm{poly}(k)$ and margin at least $\Omega(\delta/\sqrt{d})$. An immediate consequence of our result is the first near optimal bound on the error increase in the reduction from a Separation oracle to an Optimization oracle over a polytope. RSH has algorithmic applications in learning polytopes. We consider a fundamental problem, denoted the "Hausdorff problem", of learning a unit diameter polytope $K$ within Hausdorff distance $\delta$, given an optimization oracle for $K$. Using RSH, we show that with polynomially many random queries to the optimization oracle, $K$ can be approximated within error $O(\delta)$. To our knowledge this is the first provable algorithm for the Hausdorff problem. Building on this result, we show that if the vertices of $K$ are well-separated, then an optimization oracle can be used to generate a list of points, each within Hausdorff distance $O(\delta)$ of $K$, with the property that the list contains a point close to each vertex of $K$. Further, we show how to prune this list to generate a (unique) approximation to each vertex of the polytope. We prove that in many latent variable settings, e.g., topic modeling, LDA, optimization oracles do exist provided we project to a suitable SVD subspace. Thus, our work yields the first efficient algorithm for finding approximations to the vertices of the latent polytope under the well-separatedness assumption.
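
As a rough numerical illustration of the RSH statement, the following sketch estimates how often a random direction separates a point from a small polytope with a given margin. It is not the paper's algorithm: the choice of a uniformly random unit direction, the function name `separation_rate`, the example simplex, and the margin constant are all assumptions made for illustration, and the maximum over the vertices stands in for the optimization oracle.

```python
import numpy as np

def separation_rate(vertices, a, margin, trials=2000, seed=0):
    """Fraction of random unit directions u with <u, a> >= max_v <u, v> + margin.

    Hypothetical helper: the sampling distribution is an assumption,
    not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    d = vertices.shape[1]
    u = rng.standard_normal((trials, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)  # normalize to unit directions
    # Stand-in for the optimization oracle: the maximum of <u, x> over K
    # is attained at a vertex.
    support = (u @ vertices.T).max(axis=1)
    gap = u @ a - support
    return float(np.mean(gap >= margin))

d = 3
vertices = np.eye(d) / np.sqrt(2)      # pairwise vertex distance 1, so unit diameter
centroid = vertices.mean(axis=0)
normal = np.ones(d) / np.sqrt(d)       # unit normal to the simplex's affine hull
delta = 0.5
a = centroid + delta * normal          # at distance exactly delta from K

rate_outside = separation_rate(vertices, a, margin=delta / np.sqrt(d))
rate_inside = separation_rate(vertices, centroid, margin=1e-9)
print(rate_outside, rate_inside)
```

In this toy setup the point at distance $\delta$ is separated by a positive fraction of the sampled directions, while the centroid, which lies inside $K$, is never separated. The sketch says nothing about the $1/\mathrm{poly}(k)$ rate itself.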
