We study the minimax settings of binary classification with F-score under β-smoothness assumptions on the regression function η(x) = P(Y = 1 | X = x) for β > 0. We propose a classification procedure which, under the α-margin assumption, achieves the rate n^{-β(1+α)/(2β+d)} for the excess F-score. In this context, the Bayes optimal classifier for the F-score can be obtained by thresholding the aforementioned regression function at some level θ* to be estimated. The proposed procedure is performed in a semi-supervised manner: for the estimation of the regression function we use a labeled dataset of size n, and for the estimation of the optimal threshold we use an unlabeled dataset of size N. Interestingly, the value of N does not affect the rate of convergence, which indicates that it is "harder" to estimate the regression function than the optimal threshold θ*. This further implies that binary classification with the F-score behaves similarly to the standard settings of binary classification. Finally, we show that the rates achieved by the proposed procedure are optimal in the minimax sense up to a constant factor.
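The semi-supervised plug-in scheme described above can be sketched as follows. This is a minimal illustration, not the paper's exact estimator: the k-NN regressor, the grid search over empirical thresholds, and all parameter values are assumptions. The sketch relies on the fact that, given an estimate of η, the F-score of the classifier 1{η(X) ≥ θ} can be expressed through η and the marginal of X alone (precision and recall both reduce to expectations of η), so the threshold can be tuned on unlabeled data.

```python
import numpy as np

def knn_regress(X_train, y_train, X_query, k=15):
    """Plain k-NN estimate of the regression function eta(x) = P(Y=1 | X=x).

    A stand-in nonparametric estimator (an assumption, not the paper's choice).
    """
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def fit_f_score_threshold(eta_hat_unlabeled):
    """Pick the threshold maximizing a plug-in F-score estimate.

    With eta plugged in, F(theta) = 2 E[eta 1{eta >= theta}]
    / (E[eta] + P(eta >= theta)), so only unlabeled points are needed.
    """
    e = np.sort(eta_hat_unlabeled)
    best_theta, best_f = 0.5, -np.inf
    for theta in e:  # candidate thresholds: the observed eta-hat values
        keep = e >= theta
        f = 2.0 * e[keep].sum() / (e.sum() + keep.sum())
        if f > best_f:
            best_f, best_theta = f, theta
    return best_theta

# Synthetic demo: labeled set of size n = 500, unlabeled set of size N = 2000.
rng = np.random.default_rng(0)
X_lab = rng.uniform(size=(500, 1))
eta_true = lambda X: 0.9 * X[:, 0]                      # hypothetical eta
y_lab = (rng.uniform(size=500) < eta_true(X_lab)).astype(float)
X_unlab = rng.uniform(size=(2000, 1))

eta_hat = knn_regress(X_lab, y_lab, X_unlab)            # step 1: labeled data
theta_hat = fit_f_score_threshold(eta_hat)              # step 2: unlabeled data
predict = lambda X: (knn_regress(X_lab, y_lab, X) >= theta_hat).astype(int)
```

Separating the two estimation steps mirrors the abstract's point: enlarging N sharpens only the threshold step, while the rate is driven by the harder problem of estimating η from the n labeled points.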