Designing labeled graph classifiers by exploiting the Rényi entropy of the dissimilarity representation

Representing patterns as labeled graphs is becoming increasingly common in the broad field of computational intelligence. Accordingly, a wide repertoire of pattern recognition tools, such as classifiers and knowledge discovery procedures, is now available and has been tested on various labeled graph data types. However, designing effective learning and mining procedures that operate in the space of labeled graphs remains challenging, especially in terms of computational complexity. In this paper, we present a major improvement of a general-purpose classifier for graphs, which is built on an interplay among dissimilarity representation, clustering, information-theoretic techniques, and evolutionary optimization. The improvement focuses on a key subroutine devised to compress the input data. We prove several theorems that are fundamental to the setting of this compression operation. We demonstrate the effectiveness of the resulting classifier by benchmarking the developed variants on well-known datasets of labeled graphs, using three distinct performance indicators: classification accuracy, computing time, and parsimony in terms of the structural complexity of the synthesized classification model. The results show state-of-the-art test set accuracy and a considerable speed-up in computing time.