
Optimal level set estimation for non-parametric tournament and crowdsourcing problems

Main:15 Pages
5 Figures
Bibliography:3 Pages
Appendix:36 Pages
Abstract

Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of $n$ experts on $d$ questions. In this paper, we assume that both the experts and the questions can be ordered, namely that the matrix $M$ containing the probability that expert $i$ answers question $j$ correctly is bi-isotonic up to a permutation of its rows and columns. When $n=d$, this also encompasses the strongly stochastic transitive (SST) model from the tournament literature. Here, we focus on the problem of distinguishing small entries of $M$ from large entries of $M$, which is key in crowdsourcing for efficient allocation of workers to questions. More precisely, we aim at recovering the level set of the matrix at one (or several) levels $p$ up to a precision $h$, namely recovering, respectively, the sets of positions $(i,j)$ in $M$ such that $M_{ij}>p+h$ and $M_{ij}<p-h$. We consider, as a loss measure, the number of misclassified entries. As our main result, we construct an efficient polynomial-time algorithm that turns out to be minimax optimal for this classification problem. This contrasts sharply with the existing literature on the SST model where, for the stronger reconstruction loss, statistical-computational gaps have been conjectured. More generally, this sheds light on the nature of statistical-computational gaps for permutation models.
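To make the loss concrete, here is a minimal Python sketch of one reading of "number of misclassified entries". It is an illustration under stated assumptions, not the paper's algorithm. The function name `level_set_loss`, the $\pm 1$ label encoding, and the choice to ignore entries inside the band $[p-h, p+h]$ are all hypothetical, and the toy matrix is made up for the example.

```python
def level_set_loss(M, labels, p, h):
    """Count misclassified entries for level p and precision h.

    Assumed convention (not taken from the paper):
    labels[i][j] is +1 if entry (i, j) is declared above p, -1 if below.
    An entry counts as an error only when M[i][j] > p + h with label -1,
    or M[i][j] < p - h with label +1. Entries inside [p - h, p + h]
    are never counted.
    """
    errors = 0
    for row_m, row_l in zip(M, labels):
        for m, lab in zip(row_m, row_l):
            if m > p + h and lab != 1:
                errors += 1
            elif m < p - h and lab != -1:
                errors += 1
    return errors


# Toy bi-isotonic matrix: rows and columns are nondecreasing.
M = [
    [0.1, 0.3, 0.5],
    [0.2, 0.6, 0.8],
    [0.4, 0.7, 0.9],
]

# Hand-written labels; only the bottom-right entry (0.9) is wrong.
labels = [
    [-1, -1, 1],
    [-1, 1, 1],
    [1, 1, -1],
]

print(level_set_loss(M, labels, p=0.5, h=0.15))  # prints 1
```

With $p=0.5$ and $h=0.15$, the entries 0.4, 0.5 and 0.6 lie inside the band and contribute nothing whatever their label, so the single error comes from labelling 0.9 as below the level.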
