
Algorithms for Discrepancy, Matchings, and Approximations: Fast, Simple, and Practical

Abstract

We study one of the key tools in data approximation and optimization: low-discrepancy colorings. Formally, given a finite set system $(X,\mathcal S)$, the \emph{discrepancy} of a two-coloring $\chi:X\to\{-1,1\}$ is defined as $\max_{S \in \mathcal S}|\chi(S)|$, where $\chi(S)=\sum_{x \in S}\chi(x)$. We propose a randomized algorithm which, for any $d>0$ and $(X,\mathcal S)$ with dual shatter function $\pi^*(k)=O(k^d)$, returns a coloring with expected discrepancy $O\left(\sqrt{|X|^{1-1/d}\log|\mathcal S|}\right)$ (this bound is tight) in time $\tilde O\left(|\mathcal S|\cdot|X|^{1/d}+|X|^{2+1/d}\right)$, improving upon the previous-best time of $O\left(|\mathcal S|\cdot|X|^3\right)$ by at least a factor of $|X|^{2-1/d}$ when $|\mathcal S|\geq|X|$. This setup includes many geometric classes, families of bounded dual VC-dimension, and others. As an immediate consequence, we obtain an improved algorithm to construct $\varepsilon$-approximations of sub-quadratic size. Our method uses primal-dual reweighing with an improved analysis of randomly updated weights and exploits the structural properties of the set system via matchings with low crossing number -- a fundamental structure in computational geometry. In particular, we get the same $|X|^{2-1/d}$ factor speed-up on the construction time of matchings with crossing number $O\left(|X|^{1-1/d}\right)$, which is the first improvement since the 1980s. The proposed algorithms are very simple, which makes it possible, for the first time, to compute colorings with near-optimal discrepancies and near-optimal sized approximations for abstract and geometric set systems in dimensions higher than $2$.
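
To make the definition concrete, the following minimal sketch evaluates the discrepancy $\max_{S \in \mathcal S}|\chi(S)|$ of a given two-coloring on a small set system. It only illustrates the definition and is not the algorithm proposed in the paper; the function name `discrepancy` and the toy set system are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable


def discrepancy(sets: Iterable[Iterable[Hashable]],
                coloring: Dict[Hashable, int]) -> int:
    """Return max over S of |chi(S)|, where chi(S) sums the colors in S.

    `coloring` maps each element of the ground set X to -1 or +1.
    (Hypothetical helper for illustration only.)
    """
    worst = 0
    for s in sets:
        # chi(S) = sum of chi(x) for x in S
        chi_s = sum(coloring[x] for x in s)
        worst = max(worst, abs(chi_s))
    return worst


# Toy set system on X = {0, 1, 2, 3} (assumed example data).
toy_sets = [[0, 1], [1, 2, 3], [0, 1, 2, 3]]
alternating = {0: 1, 1: -1, 2: 1, 3: -1}
all_plus = {0: 1, 1: 1, 2: 1, 3: 1}

print(discrepancy(toy_sets, alternating))  # chi values are 0, -1, 0
print(discrepancy(toy_sets, all_plus))     # chi values are 2, 3, 4
```

On this toy input the alternating coloring keeps every set nearly balanced, while the constant coloring has discrepancy equal to the size of the largest set.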
