A Unified Approach to Semi-Supervised Out-of-Distribution Detection

Abstract

One of the early weaknesses identified in deep neural networks trained for image classification was their inability to provide low-confidence predictions on out-of-distribution (OOD) data that differed significantly from the in-distribution (ID) data used to train them. Representation learning, where neural networks are trained in specific ways that improve their ability to detect OOD examples, has emerged as a promising direction for solving this problem. However, these approaches require long training times and can be computationally inefficient at detecting OOD examples. Recent developments in Vision Transformer (ViT) foundation models (large networks trained on large and diverse datasets with self-supervised approaches) also show strong performance in OOD detection, and could potentially address some of these challenges. This paper presents Mixture of Exemplars (MoLAR), an approach that provides a unified way of tackling OOD detection challenges in both supervised and semi-supervised settings, and that is designed to be trained with a frozen, pretrained foundation model backbone. MoLAR is efficient to train, and provides strong OOD performance when only comparing the distance of OOD examples to the exemplars, a small set of images chosen to be representative of the dataset. As a result, determining if an image is OOD with MoLAR is no more expensive than classifying an image. Extensive experiments demonstrate the superior OOD detection performance of MoLAR relative to comparable approaches, as well as its strong performance in semi-supervised settings.
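
To make the idea of exemplar-distance scoring concrete, the following is a minimal sketch and not the MoLAR implementation. It assumes that image embeddings from a frozen backbone are already available as NumPy arrays, and it uses cosine distance to the nearest exemplar as the OOD score. The abstract does not specify the distance measure, the exemplar selection procedure, or the threshold, so the function names `ood_scores` and `is_ood` and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def ood_scores(embeddings, exemplars):
    """Return, for each embedding, its cosine distance to the nearest exemplar.

    Assumption: cosine distance is an illustrative choice; the source
    abstract does not state which distance MoLAR uses.
    """
    z = l2_normalize(np.asarray(embeddings, dtype=float))
    e = l2_normalize(np.asarray(exemplars, dtype=float))
    cos_dist = 1.0 - z @ e.T          # shape: (n_images, n_exemplars)
    return cos_dist.min(axis=1)       # larger score = further from all exemplars


def is_ood(embeddings, exemplars, threshold):
    """Flag embeddings whose nearest-exemplar distance exceeds a threshold.

    Assumption: the threshold is a hypothetical, user-chosen value.
    """
    return ood_scores(embeddings, exemplars) > threshold
```

Under these assumptions, scoring costs one matrix product between the image embeddings and the small exemplar set, which is comparable to the cost of a linear classification head on the same embeddings.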
