On the Learning with Augmented Class via Forests

14 May 2025

Abstract

Decision trees and forests have succeeded in many real-world applications, but most assume that every class seen at test time is already present in the training data. In this work, we focus on learning with augmented class via forests, where an augmented class may appear in the testing data but not in the training data. We incorporate information about the augmented class into tree splitting: we introduce a new splitting criterion, called augmented Gini impurity, that exploits some unlabeled data from the testing distribution. We then develop an approach named Learning with Augmented Class via Forests (LACForest), which constructs shallow forests based on the augmented Gini impurity and then further splits the forests with pseudo-labeled augmented instances for better performance. We also develop deep neural forests with a novel optimization objective based on our augmented Gini impurity, so as to utilize the representation power of neural networks for forests. Theoretically, we present a convergence analysis for the augmented Gini impurity, and finally we conduct experiments to verify the effectiveness of our approaches. The code is available at this https URL.
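For orientation, the sketch below computes the classical Gini impurity that tree-splitting criteria of this kind build on. It is not the augmented Gini impurity of the paper, whose definition is not given in this abstract; the function names `gini_impurity` and `split_impurity` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Classical Gini impurity: 1 - sum_k p_k^2 over class proportions p_k.

    This is the standard criterion, not the paper's augmented variant.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(left_labels, right_labels):
    """Size-weighted Gini impurity of a candidate split into two children."""
    n = len(left_labels) + len(right_labels)
    if n == 0:
        return 0.0
    weighted = (len(left_labels) * gini_impurity(left_labels)
                + len(right_labels) * gini_impurity(right_labels))
    return weighted / n
```

A tree learner would evaluate `split_impurity` for each candidate split and keep the one with the lowest value. According to the abstract, the augmented criterion additionally uses unlabeled data from the testing distribution, which this sketch does not attempt to model.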

@article{xu2025_2505.09294,
  title={On the Learning with Augmented Class via Forests},
  author={Fan Xu and Wuyang Chen and Wei Gao},
  journal={arXiv preprint arXiv:2505.09294},
  year={2025}
}
