There has been considerable recent interest in distribution tests whose run-time and sample requirements are sublinear in the domain size. We study two of the most important tests under the conditional-sampling model, where each query specifies a subset of the domain and the response is a sample drawn from that subset according to the underlying distribution. For identity testing, which asks whether the underlying distribution equals a specific given distribution or differs from it by a prescribed amount, we reduce the known time and sample complexities, thereby matching the information-theoretic lower bound. For closeness testing, which asks whether two distributions underlying observed data sets are equal or different, we reduce the existing complexity to an even sub-logarithmic bound, thus providing a better bound for an open problem from the Bertinoro Workshop on Sublinear Algorithms [Fisher, 2004].
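As an illustration of the conditional-sampling model described above, the following minimal sketch simulates a single query against a known distribution. The function name `conditional_sample`, the dictionary representation of the distribution, and the choice to raise an error when the queried subset has zero probability mass are all assumptions made here for illustration; they are not taken from the paper, and other conventions for zero-mass queries are possible.

```python
import random


def conditional_sample(dist, subset, rng=None):
    """Simulate one conditional-sampling query (illustrative sketch).

    dist   : dict mapping each domain element to its probability
    subset : iterable of domain elements specified by the query
    rng    : optional random.Random instance, for reproducibility

    Returns an element of `subset` drawn with probability proportional
    to its mass under `dist`, i.e. from `dist` conditioned on `subset`.
    """
    if rng is None:
        rng = random.Random()
    # Deduplicate while preserving order; keep only elements with positive mass.
    elements = [x for x in dict.fromkeys(subset) if dist.get(x, 0.0) > 0.0]
    if not elements:
        # Assumption: a zero-mass query is treated as an error in this sketch.
        raise ValueError("query subset has zero probability mass")
    weights = [dist[x] for x in elements]
    # random.choices normalizes the weights, which performs the conditioning.
    return rng.choices(elements, weights=weights, k=1)[0]


if __name__ == "__main__":
    dist = {"a": 0.5, "b": 0.3, "c": 0.2}
    # Querying the subset {a, b} returns "a" with probability 0.5 / 0.8.
    print(conditional_sample(dist, ["a", "b"]))
```

A tester in this model only interacts with the unknown distribution through such queries, so its cost is measured by the number of queries it issues.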