v1v2 (latest)

Quality control in sublinear time: a case study via random graphs

22 August 2025

Cassandra Marcussen

Ronitt Rubinfeld

Madhu Sudan

ArXiv (abs)PDF HTML Github

Main:53 Pages

Bibliography:5 Pages

Appendix:14 Pages

Abstract

Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call "Quality Control Problems." These problems are specified by a (positive, real-valued) "quality function" $\rho$ and a distribution $D$ such that, with high probability, a sample drawn from $D$ is "high quality," meaning its $\rho$ -value is near $1$ . The goal is to accept inputs $x \sim D$ and reject potentially adversarially generated inputs $x$ with $\rho(x)$ far from $1$ . The objective of quality control is thus weaker than either component problem: testing for " $\rho(x) \approx 1$ " or testing if $x \sim D$ , and offers the possibility of more efficient algorithms.In this work, we consider the sublinear version of the quality control problem, where $D \in \Delta(\{0,1\}^N)$ and the goal is to solve the $(D ,\rho)$ -quality problem with $o(N)$ queries and time. As a case study, we consider random graphs, i.e., $D = G_{n,p}$ (and $N = \binom{n}2$ ), and the $k$ -clique count function $\rho_k := C_k(G)/\mathbb{E}_{G' \sim G_{n,p}}[C_k(G')]$ , where $C_k(G)$ is the number of $k$ -cliques in $G$ . Testing if $G \sim G_{n,p}$ with one sample, let alone with sublinear query access to the sample, is of course impossible. Testing if $\rho_k(G)\approx 1$ requires $p^{-\Omega(k^2)}$ samples. In contrast, we show that the quality control problem for $G_{n,p}$ (with $n \geq p^{-ck}$ for some constant $c$ ) with respect to $\rho_k$ can be tested with $p^{-O(k)}$ queries and time, showing quality control is provably superpolynomially more efficient in this setting. More generally, for a motif $H$ of maximum degree $\Delta(H)$ , the respective quality control problem can be solved with $p^{-O(\Delta(H))}$ queries and running time.

View on arXiv

Comments on this paper