A Method for Restoring the Training Set Distribution in an Image Classifier

5 February 2018

Abstract

Convolutional Neural Networks are a well-known staple of modern image classification. However, it can be difficult to assess the quality and robustness of such models. Deep models are known to perform well on a given training and estimation set, but can easily be fooled by data that is specifically generated for the purpose. It has been shown that one can produce an artificial example that does not represent the desired class, but activates the network in the desired way. This paper describes a new way of reconstructing a sample from the training set distribution of an image classifier without deep knowledge about the underlying distribution. This enables access to the elements of images that most influence the decision of a convolutional network and to extract meaningful information about the training distribution.

View on arXiv

Comments on this paper