
Feature selection and classification of high-dimensional normal vectors with a possibly large number of classes

Abstract

We consider high-dimensional multi-class classification of normal vectors where, unlike standard assumptions, the number of classes may also be large. We derive non-asymptotic conditions on the effects of significant features, and lower and upper bounds on the distances between classes, required for successful feature selection and classification with a given accuracy. In particular, we present an interesting and, at first glance, somewhat counter-intuitive phenomenon: the precision of classification can improve as the number of classes grows. This is due to more accurate feature selection, since even weak significant features, which are not strong enough to manifest themselves in a coarse classification, can nevertheless have a strong impact when the number of classes is large. A simulation study illustrates the performance of the procedure.
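The mechanism described above can be illustrated with a toy simulation. The sketch below is not the authors' procedure; it is a minimal hypothetical setup: Gaussian class means that differ only on a small set of significant coordinates, feature selection by thresholding the between-class variance of the sample centroids (with an assumed threshold of order log(d)/n), and nearest-centroid classification on the selected coordinates. All dimensions, thresholds, and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes: dimension, number of significant features,
# number of classes, training samples per class
d, k_sig, L, n_per = 500, 10, 8, 50

# Class means are zero except on a random set of k_sig significant features
means = np.zeros((L, d))
sig = rng.choice(d, size=k_sig, replace=False)
means[:, sig] = rng.normal(scale=2.0, size=(L, k_sig))

# Training data: standard normal noise around the class means
X = np.concatenate([means[c] + rng.normal(size=(n_per, d)) for c in range(L)])
y = np.repeat(np.arange(L), n_per)

# Feature selection: keep coordinates where the variance of the sample
# centroids across classes exceeds an (assumed) threshold ~ log(d)/n.
# With more classes, weak effects accumulate in this variance, so weak
# significant features become easier to detect.
centroids = np.stack([X[y == c].mean(axis=0) for c in range(L)])
score = centroids.var(axis=0)
thresh = 2 * np.log(d) / n_per
selected = np.flatnonzero(score > thresh)

# Nearest-centroid classification restricted to the selected features
Xte = np.concatenate([means[c] + rng.normal(size=(20, d)) for c in range(L)])
yte = np.repeat(np.arange(L), 20)
dists = ((Xte[:, None, selected] - centroids[None, :, selected]) ** 2).sum(-1)
acc = (dists.argmin(axis=1) == yte).mean()
print(f"selected {selected.size} features, test accuracy {acc:.2f}")
```

Rerunning this sketch with a larger L (keeping the per-feature effect sizes fixed) tends to push more of the weak significant coordinates above the selection threshold, which is the intuition behind the phenomenon described in the abstract.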
