GRANDE: Gradient-Based Decision Tree Ensembles

International Conference on Learning Representations (ICLR), 2023
Main: 9 pages · Appendix: 14 pages · Bibliography: 4 pages · 5 figures · 29 tables
Abstract

Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific gradient-based methods due to their high flexibility. In this paper, we propose GRANDE, GRAdieNt-Based Decision Tree Ensembles, a novel approach for learning hard, axis-aligned decision tree ensembles with end-to-end gradient descent. GRANDE is based on a dense representation of tree ensembles, which allows backpropagation with a straight-through operator to jointly optimize all model parameters. Our method combines axis-aligned splits, a useful inductive bias for tabular data, with the flexibility of gradient-based optimization. Furthermore, we introduce an advanced instance-wise weighting that facilitates learning representations for both simple and complex relations within a single model. We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets.
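To illustrate the core idea of training hard splits with gradients, the following is a minimal sketch (not the paper's implementation) of the straight-through trick on a single axis-aligned split: the forward pass uses the hard decision (x > t), while the backward pass uses the gradient of a soft sigmoid surrogate, so the threshold t remains learnable by gradient descent. All names (`train_threshold`, the toy data, the squared loss) are illustrative assumptions, and a full tree ensemble would learn feature selections, thresholds, and leaf values jointly.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_threshold(x, y, t=0.0, lr=0.5, steps=200):
    """Learn a split threshold t so that the hard decision (x > t)
    matches binary labels y. Hypothetical sketch of a straight-through
    update: hard forward pass, soft (sigmoid) backward pass."""
    for _ in range(steps):
        hard = (x > t).astype(float)      # hard, axis-aligned forward decision
        s = sigmoid(x - t)                # soft surrogate used only for gradients
        grad_out = 2.0 * (hard - y)       # d/d(output) of squared loss on the hard output
        # straight-through: propagate grad_out through the surrogate's
        # derivative d s / d t = -s * (1 - s) instead of the (zero) hard gradient
        grad_t = np.mean(grad_out * (-s * (1.0 - s)))
        t -= lr * grad_t
    return t

# Toy data: the label is 1 iff x > 1.5, so any threshold in (1, 2) is optimal.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
t = train_threshold(x, y)
preds = (x > t).astype(float)
```

The hard step function has zero gradient almost everywhere; substituting the surrogate's gradient in the backward pass is what makes end-to-end optimization of hard, axis-aligned splits possible.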
