Undefined class-label detection vs out-of-distribution detection

We introduce a new problem, that of undefined class-label (UCL) detection. For instance, if we try to classify an image of a radio as cat vs. dog, there is no well-defined class label. In contrast, out-of-distribution (OOD) detection addresses the related but different problem of identifying regions of the input space with little training data, where classifier performance may be poor. This difference is critical: there can be regions of the input space with little training data where class labels are nonetheless well-defined. Likewise, there may be regions with plenty of training data but without well-defined class labels (though in practice this would often be the result of a bug in the labelling pipeline). We note that certain methods originally intended to detect OOD inputs may in fact be detecting UCL points. We then develop a method for training on UCL points, based on a generative model of data curation originally used to explain the cold posterior effect in Bayesian neural networks. This approach outperforms past methods originally intended for OOD detection.
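To make the UCL-vs-OOD distinction concrete, here is a minimal sketch of one way the idea could be scored, assuming a consensus-style data-curation model in which a label is kept only if several independent annotators, each sampling from p(y|x), agree. The function name, annotator count, and toy logits below are illustrative assumptions, not the paper's exact construction; the point is that an ambiguous input (e.g. the radio) gets a low consensus score even if it lies in a high-density region, which is what separates UCL detection from density-based OOD detection.

```python
import torch
import torch.nn.functional as F


def consensus_score(logits: torch.Tensor, num_annotators: int = 5) -> torch.Tensor:
    """Probability that `num_annotators` independent annotators, each drawing a
    label from the model's predictive distribution p(y|x), all agree.
    Low scores flag inputs whose class label is plausibly undefined (UCL),
    regardless of how much training data lies nearby."""
    probs = F.softmax(logits, dim=-1)              # p(y|x), shape [batch, classes]
    return probs.pow(num_annotators).sum(dim=-1)   # sum_y p(y|x)^S

# Toy cat-vs-dog classifier outputs (hypothetical values for illustration).
cat_logits = torch.tensor([[4.0, -4.0]])    # confidently "cat": label well-defined
radio_logits = torch.tensor([[0.1, -0.1]])  # near-uniform: annotators would disagree

print(consensus_score(cat_logits))    # ~0.998: annotators almost surely agree
print(consensus_score(radio_logits))  # ~0.06: consensus unlikely, candidate UCL point
```

A standard OOD score based on data density would treat these two inputs very differently only if the radio were far from the training data; the consensus score instead targets whether a label is well-defined at all.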