Coresets for Decision Trees of Signals

7 October 2021
Ibrahim Jubran
Ernesto Evgeniy Sanches Shayda
I. Newman
Dan Feldman
Abstract

A $k$-decision tree $t$ (or $k$-tree) is a recursive partition of a matrix (2D-signal) into $k\geq 1$ block matrices (axis-parallel rectangles, leaves), where each rectangle is assigned a real label. Its regression or classification loss with respect to a given matrix $D$ of $N$ entries (labels) is the sum of squared differences between every label in $D$ and its assigned label by $t$. Given an error parameter $\varepsilon\in(0,1)$, a $(k,\varepsilon)$-coreset $C$ of $D$ is a small summarization that provably approximates this loss for \emph{every} such tree, up to a multiplicative factor of $1\pm\varepsilon$. In particular, the optimal $k$-tree of $C$ is a $(1+\varepsilon)$-approximation to the optimal $k$-tree of $D$. We provide the first algorithm that outputs such a $(k,\varepsilon)$-coreset for \emph{every} such matrix $D$. The size $|C|$ of the coreset is polynomial in $k\log(N)/\varepsilon$, and its construction takes $O(Nk)$ time. This is done by forging a link between decision trees from machine learning and partition trees from computational geometry. Experimental results on \texttt{sklearn} and \texttt{lightGBM} show that applying our coresets to real-world datasets speeds up the computation of random forests and their parameter tuning by up to x10, while keeping similar accuracy. Full open source code is provided.
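To make the loss concrete, here is a minimal Python sketch of the sum-of-squared-differences objective defined above. The encoding of a $k$-tree as a list of (row-slice, column-slice, label) leaf rectangles is our own illustration for this page, not the paper's actual data structure or algorithm:

```python
import numpy as np

def ktree_loss(D, leaves):
    """Sum of squared differences between each entry of the 2D-signal D
    and the label of the leaf (axis-parallel rectangle) containing it.

    `leaves` is a list of (row_slice, col_slice, label) triples that
    together partition D -- a hypothetical encoding of a k-tree's
    k leaf rectangles, used here only to illustrate the loss.
    """
    loss = 0.0
    for rows, cols, label in leaves:
        block = D[rows, cols]
        loss += np.sum((block - label) ** 2)
    return loss

# Toy 4x4 signal split into k = 2 leaves along the middle column;
# each leaf is labeled with its block mean (the loss-minimizing label).
D = np.arange(16, dtype=float).reshape(4, 4)
leaves = [
    (slice(0, 4), slice(0, 2), D[:, :2].mean()),  # left rectangle
    (slice(0, 4), slice(2, 4), D[:, 2:].mean()),  # right rectangle
]
print(ktree_loss(D, leaves))  # 324.0 for this toy signal
```

A $(k,\varepsilon)$-coreset $C$ would be a much smaller weighted set of entries on which this same loss, evaluated for any such set of $k$ leaf rectangles, lands within $1\pm\varepsilon$ of the value computed on the full matrix $D$.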
