
Provable guarantees for decision tree induction: the agnostic setting

Abstract

We give strengthened provable guarantees on the performance of widely employed and empirically successful {\sl top-down decision tree learning heuristics}. While prior works have focused on the realizable setting, we consider the more realistic and challenging {\sl agnostic} setting. We show that for all monotone functions $f$ and parameters $s \in \mathbb{N}$, these heuristics construct a decision tree of size $s^{\tilde{O}((\log s)/\varepsilon^2)}$ that achieves error $\le \mathsf{opt}_s + \varepsilon$, where $\mathsf{opt}_s$ denotes the error of the optimal size-$s$ decision tree for $f$. Previously, such a guarantee was not known to be achievable by any algorithm, even one that is not based on top-down heuristics. We complement our algorithmic guarantee with a near-matching $s^{\tilde{\Omega}(\log s)}$ lower bound.
