arXiv:2106.13851

Approximate Maximum Halfspace Discrepancy

25 June 2021
Michael Matheny
J. Phillips
Abstract

Consider the geometric range space $(X, \mathcal{H}_d)$ where $X \subset \mathbb{R}^d$ and $\mathcal{H}_d$ is the set of ranges defined by $d$-dimensional halfspaces. In this setting we consider that $X$ is the disjoint union of a red set and a blue set. For each halfspace $h \in \mathcal{H}_d$ define a function $\Phi(h)$ that measures the "difference" between the fraction of red points and the fraction of blue points which fall in the range $h$. In this context the maximum discrepancy problem is to find $h^* = \arg\max_{h \in (X, \mathcal{H}_d)} \Phi(h)$. We aim to instead find an $\hat{h}$ such that $\Phi(h^*) - \Phi(\hat{h}) \le \varepsilon$. This is the central problem in linear classification for machine learning and in spatial scan statistics for spatial anomaly detection, and it shows up in many other areas. We provide a solution for this problem in $O(|X| + (1/\varepsilon^d) \log^4 (1/\varepsilon))$ time, which improves polynomially over the previous best solutions. For $d = 2$ we show that this is nearly tight through conditional lower bounds. For different classes of $\Phi$ we can either provide an $\Omega(|X|^{3/2 - o(1)})$ time lower bound for the exact solution with a reduction to APSP, or an $\Omega(|X| + 1/\varepsilon^{2 - o(1)})$ lower bound for the approximate solution with a reduction to 3SUM. A key technical result is an $\varepsilon$-approximate halfspace range counting data structure of size $O(1/\varepsilon^d)$ with $O(\log (1/\varepsilon))$ query time, which we can build in $O(|X| + (1/\varepsilon^d) \log^4 (1/\varepsilon))$ time.
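
To make the problem statement concrete, the following is a minimal brute-force sketch for $d = 2$. It is not the algorithm from the paper. It assumes one particular choice of $\Phi$, the absolute difference between the red and blue fractions, and it assumes the points are in general position (no three collinear). The function names `phi` and `max_halfplane_discrepancy` are hypothetical, introduced only for this illustration. The sketch enumerates the halfplanes whose boundary passes through a pair of input points, which takes $O(|X|^3)$ time.

```python
from itertools import combinations


def phi(red_in, blue_in, n_red, n_blue):
    """Assumed discrepancy: |fraction of red inside - fraction of blue inside|."""
    return abs(red_in / n_red - blue_in / n_blue)


def max_halfplane_discrepancy(red, blue):
    """Brute-force maximum of phi over halfplanes in 2D.

    Assumes general position (no three collinear points) and that both
    colour classes are non-empty. Runs in O(n^3) time.
    """
    # Tag each point with its colour: 0 = red, 1 = blue.
    points = [(p, 0) for p in red] + [(p, 1) for p in blue]
    n_red, n_blue = len(red), len(blue)
    best = 0.0
    for (p, cp), (q, cq) in combinations(points, 2):
        # Count points strictly to the left / right of the line through p and q.
        left, right = [0, 0], [0, 0]
        for s, cs in points:
            cross = (q[0] - p[0]) * (s[1] - p[1]) - (q[1] - p[1]) * (s[0] - p[0])
            if cross > 0:
                left[cs] += 1
            elif cross < 0:
                right[cs] += 1
        # A slight perturbation of the boundary can include or exclude
        # p and q independently, on either side of the line.
        for side in (left, right):
            for take_p in (0, 1):
                for take_q in (0, 1):
                    counts = side[:]
                    counts[cp] += take_p
                    counts[cq] += take_q
                    best = max(best, phi(counts[0], counts[1], n_red, n_blue))
    return best
```

For example, when the red and blue sets are linearly separable, some halfplane contains every red point and no blue point, so the maximum under this $\Phi$ is 1. The cubic cost of this sketch is what faster exact and approximate methods aim to avoid.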
