Optimal level set estimation for non-parametric tournament and crowdsourcing problems
Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of $n$ experts on $d$ questions. In this paper, we assume that both the experts and the questions can be ordered, namely that the matrix $M$ containing the probability that expert $i$ answers question $j$ correctly is bi-isotonic up to a permutation of its rows and columns. When $n=d$, this also encompasses the strongly stochastic transitive (SST) model from the tournament literature. Here, we focus on the problem of distinguishing small entries of $M$ from large entries of $M$, which is key in crowdsourcing for the efficient allocation of workers to questions. More precisely, we aim at recovering one (or several) level sets $p$ of the matrix up to a precision $h$, namely recovering, respectively, the sets of positions $(i,j)$ in $M$ such that $M_{ij}>p+h$ and $M_{ij}<p-h$. We consider, as a loss measure, the number of misclassified entries. As our main result, we construct an efficient polynomial-time algorithm that turns out to be minimax optimal for this classification problem. This contrasts sharply with the existing literature on the SST model where, for the stronger reconstruction loss, statistical-computational gaps have been conjectured. More generally, this sheds light on the nature of statistical-computational gaps for permutation models.
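To make the objects in the abstract concrete, the following Python sketch builds a toy bi-isotonic matrix, hides its ordering behind row and column permutations, and counts misclassified entries for a level and a precision (called `p` and `h` here). It is only an illustration of the problem statement, not the algorithm of the paper: the matrix formula, the function names, and the convention that entries within `h` of the level are never counted as errors are all assumptions made for this example.

```python
def bi_isotonic_matrix(n, d):
    """Toy n x d matrix, non-decreasing along every row and every column.

    The formula is an arbitrary illustrative choice (assumption).
    """
    return [[(i + j + 1) / (n + d) for j in range(d)] for i in range(n)]


def permute(M, row_perm, col_perm):
    """Reorder rows and columns, hiding the underlying ordering."""
    return [[M[r][c] for c in col_perm] for r in row_perm]


def level_sets(M, p, h):
    """Positions clearly above (> p + h) and clearly below (< p - h) the level."""
    upper, lower = set(), set()
    for i, row in enumerate(M):
        for j, value in enumerate(row):
            if value > p + h:
                upper.add((i, j))
            elif value < p - h:
                lower.add((i, j))
    return upper, lower


def misclassified(M, p, h, labels):
    """Count entries whose predicted side disagrees with their level set.

    labels[i][j] is True when position (i, j) is predicted to be above p.
    Assumption: entries with |M[i][j] - p| <= h are never penalised.
    """
    upper, lower = level_sets(M, p, h)
    wrong_upper = sum(1 for (i, j) in upper if not labels[i][j])
    wrong_lower = sum(1 for (i, j) in lower if labels[i][j])
    return wrong_upper + wrong_lower


# Hypothetical toy instance: 3 experts, 4 questions.
M = bi_isotonic_matrix(3, 4)
P = permute(M, row_perm=[2, 0, 1], col_perm=[1, 3, 0, 2])
p, h = 0.5, 0.1

upper, lower = level_sets(P, p, h)
oracle = [[value > p for value in row] for row in P]   # uses the true matrix
all_below = [[False] * 4 for _ in range(3)]            # trivial predictor

oracle_errors = misclassified(P, p, h, oracle)
trivial_errors = misclassified(P, p, h, all_below)
```

In this toy instance the oracle labelling makes no errors, while the trivial predictor misclassifies exactly the entries in the upper level set. An estimator for the actual problem would only see noisy, partial observations of the permuted matrix rather than the matrix itself.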