Fighting biases with dynamic boosting

28 June 2017

Anna Veronika Dorogush

Aleksandr Vorobev

Abstract

While gradient boosting algorithms are the workhorse of modern industrial machine learning and data science, all current implementations are susceptible to a non-trivial but damaging form of label leakage. This leakage introduces a systematic bias in the pointwise gradient estimates, which leads to reduced accuracy. This paper formally analyzes the issue and presents solutions that produce unbiased pointwise gradient estimates. Experimental results demonstrate that our open-source implementation of gradient boosting, which incorporates the proposed algorithm, produces state-of-the-art results and outperforms popular gradient boosting implementations.
