Feature Selection based on the Local Lift Dependence Scale
This paper takes a classical approach to feature selection: minimization of a cost function over an estimated joint distribution. However, the space in which this minimization is performed is extended from the Boolean lattice generated by the power set of the features to the Boolean lattice generated by the power set of the features' support, i.e., the values they can assume. In this approach we can not only select the features that are most related to a variable of interest, but also select the values of the features that most influence that variable, or that are most prone to co-occur with a specific value of it. The \textit{Local Lift Dependence Scale}, a scale for measuring variable dependence at multiple \textit{resolutions}, is used to develop the cost functions, which are based on classical dependence measures such as Mutual Information, Cross Entropy and the Kullback-Leibler Divergence. The proposed approach is applied to a dataset consisting of student performances on a university entrance exam and on undergraduate courses. It is used to select the subjects of the entrance exam, and the performances on them, that are most related to performance on the undergraduate courses.
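To make the idea concrete, the sketch below illustrates the kind of search the abstract describes: instead of scoring whole features, subsets of a feature's support are scored by a dependence measure (here, mutual information between the event "the feature falls in the subset" and the target). This is a hypothetical toy stand-in for the paper's lattice minimization, with invented function names, not the authors' actual algorithm.

```python
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) from a list of (x, y) samples."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def best_value_subset(xs, ys, k):
    """Among all k-element subsets W of the support of X, return the one
    maximizing the MI between the indicator {X in W} and Y -- a brute-force
    analogue of searching the lattice of feature values rather than the
    lattice of features."""
    support = sorted(set(xs))
    best = None
    for w in combinations(support, k):
        indicator_pairs = [(x in w, y) for x, y in zip(xs, ys)]
        mi = mutual_information(indicator_pairs)
        if best is None or mi > best[1]:
            best = (w, mi)
    return best
```

For example, if a target is fully determined only when the feature takes certain values, `best_value_subset` picks out exactly those values, which a feature-level score would blur together.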