Efficient Learning for Crowdsourced Regression

28 February 2017

Jungseul Ok

Abstract

Crowdsourcing platforms have emerged as popular venues for purchasing human intelligence at low cost for large volumes of tasks. Because many low-paid workers are prone to giving noisy answers, a fundamental question is how to identify the more reliable workers and exploit this heterogeneity to infer the true answers. Despite significant research efforts on classification tasks with discrete answers, little attention has been paid to regression tasks, where the answers take continuous values. We consider the task of recovering the positions of target objects and introduce a new probabilistic model that captures the heterogeneity of the workers. We propose a belief propagation (BP) algorithm for inferring the positions and prove that it achieves the optimal mean squared error by comparing its performance to that of an oracle estimator. Our experimental results on synthetic datasets confirm our theoretical predictions. We further emulate a crowdsourcing system using the PASCAL Visual Object Classes datasets and show that de-noising the crowdsourced data with BP can significantly improve performance on the downstream vision task.
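
To illustrate why worker heterogeneity matters for mean squared error, the following is a minimal sketch and not the paper's model or its BP algorithm. It assumes one-dimensional positions, Gaussian answer noise, and a hypothetical pool of two reliable and three noisy workers. It compares a plain average of the answers with an inverse-variance weighted mean, which plays the role of an oracle here only in the sense that it is handed each worker's true noise variance. All function names and parameter values are illustrative assumptions.

```python
import random


def simulate_answers(true_pos, worker_vars, rng):
    """Each worker reports the true position plus Gaussian noise (assumed model)."""
    return [rng.gauss(true_pos, var ** 0.5) for var in worker_vars]


def plain_average(answers):
    """Baseline that ignores differences in worker reliability."""
    return sum(answers) / len(answers)


def oracle_estimate(answers, worker_vars):
    """Inverse-variance weighted mean; needs each worker's true noise variance."""
    weights = [1.0 / v for v in worker_vars]
    return sum(w * a for w, a in zip(weights, answers)) / sum(weights)


def compare_mse(num_tasks=2000, seed=0):
    """Return (MSE of plain average, MSE of oracle) over simulated tasks."""
    rng = random.Random(seed)
    # Hypothetical worker pool: two reliable workers, three noisy ones.
    worker_vars = [0.1, 0.1, 10.0, 10.0, 10.0]
    se_avg = 0.0
    se_oracle = 0.0
    for _ in range(num_tasks):
        true_pos = rng.uniform(0.0, 100.0)
        answers = simulate_answers(true_pos, worker_vars, rng)
        se_avg += (plain_average(answers) - true_pos) ** 2
        se_oracle += (oracle_estimate(answers, worker_vars) - true_pos) ** 2
    return se_avg / num_tasks, se_oracle / num_tasks


mse_avg, mse_oracle = compare_mse()
print(f"plain average MSE: {mse_avg:.4f}")
print(f"oracle MSE:        {mse_oracle:.4f}")
```

In this toy setting the weighted estimator should show a much lower error than the plain average, because it discounts the noisy workers. An inference algorithm does not know the variances in advance, which is why matching an oracle's error is a meaningful benchmark.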
