
Quality control in sublinear time: a case study via random graphs

Main: 53 pages
Bibliography: 5 pages
Appendix: 14 pages
Abstract

Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call "Quality Control Problems." These problems are specified by a (positive, real-valued) "quality function" $\rho$ and a distribution $D$ such that, with high probability, a sample drawn from $D$ is "high quality," meaning its $\rho$-value is near $1$. The goal is to accept inputs $x \sim D$ and reject potentially adversarially generated inputs $x$ with $\rho(x)$ far from $1$. The objective of quality control is thus weaker than either component problem: testing for "$\rho(x) \approx 1$" or testing if $x \sim D$, and offers the possibility of more efficient algorithms.

In this work, we consider the sublinear version of the quality control problem, where $D \in \Delta(\{0,1\}^N)$ and the goal is to solve the $(D, \rho)$-quality problem with $o(N)$ queries and time. As a case study, we consider random graphs, i.e., $D = G_{n,p}$ (and $N = \binom{n}{2}$), and the $k$-clique count function $\rho_k := C_k(G)/\mathbb{E}_{G' \sim G_{n,p}}[C_k(G')]$, where $C_k(G)$ is the number of $k$-cliques in $G$. Testing if $G \sim G_{n,p}$ with one sample, let alone with sublinear query access to the sample, is of course impossible. Testing if $\rho_k(G) \approx 1$ requires $p^{-\Omega(k^2)}$ samples. In contrast, we show that the quality control problem for $G_{n,p}$ (with $n \geq p^{-ck}$ for some constant $c$) with respect to $\rho_k$ can be tested with $p^{-O(k)}$ queries and time, showing quality control is provably superpolynomially more efficient in this setting. More generally, for a motif $H$ of maximum degree $\Delta(H)$, the respective quality control problem can be solved with $p^{-O(\Delta(H))}$ queries and running time.
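To make the quality function concrete, the following is a minimal brute-force sketch of the $k$-clique quality function $\rho_k$ as defined in the abstract. It is not the paper's sublinear algorithm: it reads every edge and enumerates all $k$-subsets of vertices, so it is only practical for small $n$. The function names (`sample_gnp`, `count_k_cliques`, `expected_k_cliques`, `rho_k`) and the edge-set representation are hypothetical choices made for illustration. The expectation uses the standard identity $\mathbb{E}[C_k(G')] = \binom{n}{k} p^{\binom{k}{2}}$ for $G' \sim G_{n,p}$.

```python
import itertools
import math
import random


def sample_gnp(n, p, rng):
    """Sample G(n, p): each of the C(n, 2) possible edges appears independently with probability p."""
    return {
        frozenset(pair)
        for pair in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


def count_k_cliques(n, edges, k):
    """Count k-cliques by brute force over all k-subsets of vertices (illustration only)."""
    return sum(
        all(frozenset(pair) in edges for pair in itertools.combinations(nodes, 2))
        for nodes in itertools.combinations(range(n), k)
    )


def expected_k_cliques(n, p, k):
    """E[C_k(G')] for G' ~ G(n, p): C(n, k) vertex subsets, each a clique with probability p^C(k, 2)."""
    return math.comb(n, k) * p ** math.comb(k, 2)


def rho_k(n, edges, p, k):
    """Quality function: observed k-clique count divided by its expectation under G(n, p)."""
    return count_k_cliques(n, edges, k) / expected_k_cliques(n, p, k)


if __name__ == "__main__":
    rng = random.Random(0)
    n, p, k = 30, 0.5, 3
    typical = sample_gnp(n, p, rng)
    # A typical sample is expected to have rho_k near 1; the exact value depends on the seed.
    print("rho_k on a G(n, p) sample:", rho_k(n, typical, p, k))
    # The complete graph is far from typical for p = 0.5: rho_k = p^(-C(k, 2)) = 8 for triangles.
    complete = {frozenset(pair) for pair in itertools.combinations(range(n), 2)}
    print("rho_k on the complete graph:", rho_k(n, complete, p, k))
```

In this sketch, a quality control tester should accept inputs like `typical` and reject inputs like `complete`, whose $\rho_k$-value is far from $1$.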
