
Learning Thresholds with Latent Values and Censored Feedback

Abstract

In this paper, we investigate the problem of actively learning a threshold in a latent space, where the unknown reward $g(\gamma, v)$ depends on the proposed threshold $\gamma$ and the latent value $v$, and can be achieved only if the threshold is lower than or equal to the unknown latent value. This problem has broad applications in practical scenarios, e.g., reserve price optimization in online auctions, online task assignment in crowdsourcing, and setting recruiting bars in hiring. We first characterize the query complexity of learning a threshold whose expected reward is at most $\epsilon$ smaller than the optimum, and prove that the number of queries needed can be infinitely large even when $g(\gamma, v)$ is monotone with respect to both $\gamma$ and $v$. On the positive side, we provide a tight query complexity of $\tilde{\Theta}(1/\epsilon^3)$ when $g$ is monotone and the CDF of the value distribution is Lipschitz. Moreover, we show that a tight $\tilde{\Theta}(1/\epsilon^3)$ query complexity can be achieved as long as $g$ satisfies one-sided Lipschitzness, which provides a complete characterization for this problem. Finally, we extend this model to an online learning setting and demonstrate a tight $\Theta(T^{2/3})$ regret bound using continuous-arm bandit techniques and the aforementioned query complexity results.
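To make the censored-feedback query model concrete, the following is a minimal Python sketch, not the paper's algorithm (which the abstract does not spell out). It assumes thresholds and values lie in $[0, 1]$ and rewards are bounded, and uses a naive uniform grid of about $1/\epsilon$ thresholds with about $1/\epsilon^2$ queries each, for roughly $1/\epsilon^3$ queries in total. The names `query` and `learn_threshold_on_grid`, the uniform value distribution, and the reward $g(\gamma, v) = \gamma$ (a reserve-price style example) are all illustrative assumptions.

```python
import random


def query(gamma, draw_value, g):
    """One censored query: the learner sees g(gamma, v) only if gamma <= v,
    and sees 0 otherwise. The latent value v itself is never revealed."""
    v = draw_value()
    return g(gamma, v) if gamma <= v else 0.0


def learn_threshold_on_grid(draw_value, g, eps):
    """Naive grid search (an illustrative assumption, not the paper's method).

    Queries each of ~1/eps grid thresholds ~1/eps^2 times and returns the
    threshold with the highest empirical mean reward, plus the query count.
    """
    k = max(1, round(1 / eps))        # number of grid cells, ~1/eps
    m = max(1, round(1 / eps ** 2))   # queries per grid threshold, ~1/eps^2
    best_gamma, best_mean = 0.0, float("-inf")
    for i in range(k + 1):
        gamma = i / k
        mean = sum(query(gamma, draw_value, g) for _ in range(m)) / m
        if mean > best_mean:
            best_gamma, best_mean = gamma, mean
    return best_gamma, (k + 1) * m


# Hypothetical instance: v ~ Uniform[0, 1] and g(gamma, v) = gamma, so the
# expected reward is gamma * (1 - gamma), maximized at gamma = 0.5.
rng = random.Random(0)
gamma_hat, n_queries = learn_threshold_on_grid(
    draw_value=rng.random,
    g=lambda gamma, v: gamma,
    eps=0.05,
)
print(gamma_hat, n_queries)
```

In this toy instance the returned threshold should land near $0.5$. The grid works here because the expected reward is Lipschitz in $\gamma$; the abstract's impossibility result indicates that, without some such regularity condition, no finite number of queries suffices.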
