
arXiv:1410.6973

Differentially- and non-differentially-private random decision trees

26 October 2014
Mariusz Bojarski
A. Choromańska
K. Choromanski
Yann LeCun
Abstract

We consider supervised learning with random decision trees, where the tree construction is completely random. The method has been used as a heuristic that works well in practice despite the simplicity of the setting, but with almost no theoretical guarantees. The goal of this paper is to shed new light on the entire paradigm. We provide strong theoretical guarantees regarding learning with random decision trees. We present and compare three different variants of the algorithm that have minimal memory requirements: majority voting, threshold averaging, and probabilistic averaging. The random structure of the tree enables us to adapt our setting to the differentially-private scenario; thus we also propose differentially-private versions of all three schemes. We give upper bounds on the generalization error and mathematically explain how the accuracy depends on the number of random decision trees. Furthermore, we prove that only a logarithmic number of independently selected random decision trees suffices to correctly classify most of the data, even when differential-privacy guarantees must be maintained. Such an analysis has never been done before. We empirically show that majority voting and threshold averaging give the best accuracy, even for conservative users requiring strong privacy guarantees. In particular, a simple majority voting rule, which had not previously been considered in the context of differentially-private learning, is an especially good candidate for a differentially-private classifier, since it is much less sensitive to the choice of forest parameters than the other methods.
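To make the setting concrete, here is a minimal, hypothetical sketch of the core idea: tree structures are drawn completely at random (split feature and threshold chosen without looking at the data), leaves then accumulate training-label counts, and a forest classifies by majority vote. The differentially-private variant below adds Laplace noise to the vote counts before taking the argmax; this is a standard noisy-max construction for illustration only, not necessarily the paper's exact mechanism, and all function names, the `[0, 1]` feature-scaling assumption, and the noise placement are assumptions of this sketch.

```python
import random
from collections import Counter

def build_random_tree(n_features, depth, rng):
    """Build a completely random tree: each internal node picks a split
    feature and threshold at random, with no access to the data."""
    if depth == 0:
        return {"counts": Counter()}  # leaf: label counts filled in during training
    return {
        "feature": rng.randrange(n_features),
        "threshold": rng.random(),  # assumes features are scaled to [0, 1]
        "left": build_random_tree(n_features, depth - 1, rng),
        "right": build_random_tree(n_features, depth - 1, rng),
    }

def leaf_for(tree, x):
    """Route a point to its leaf."""
    while "counts" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

def fit(tree, X, y):
    """Training touches only the leaf counts; the structure stays random."""
    for xi, yi in zip(X, y):
        leaf_for(tree, xi)["counts"][yi] += 1

def tree_predict(tree, x):
    counts = leaf_for(tree, x)["counts"]
    return counts.most_common(1)[0][0] if counts else 0

def forest_predict(trees, x):
    """Majority vote over the individual trees' predictions."""
    votes = Counter(tree_predict(t, x) for t in trees)
    return votes.most_common(1)[0][0]

def dp_forest_predict(trees, x, eps, rng):
    """Illustrative differentially-private majority vote: perturb each
    class's vote count with Laplace(2/eps) noise, then take the argmax
    (a noisy-max sketch, not necessarily the paper's exact scheme)."""
    votes = Counter(tree_predict(t, x) for t in trees)
    def laplace(scale):
        # Laplace(0, scale) as a random sign times an exponential draw
        return rng.choice([-1, 1]) * rng.expovariate(1.0 / scale)
    noisy = {c: votes.get(c, 0) + laplace(2.0 / eps) for c in (0, 1)}
    return max(noisy, key=noisy.get)

# Toy demo: 2-D points, label 1 iff the first feature exceeds 0.5.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if xi[0] > 0.5 else 0 for xi in X]
trees = [build_random_tree(n_features=2, depth=4, rng=rng) for _ in range(15)]
for t in trees:
    fit(t, X, y)
```

Note that because the structure is data-independent, only the leaf counts depend on the training set, which is exactly what makes the privacy analysis tractable: noise needs to protect only the counts (or the votes derived from them).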
