Transfer Learning through Greedy Subset Selection
In this work we study the binary transfer learning problem, focusing on how to select sources from a large pool and how to combine them to yield good performance on a target task. In particular, we consider the transfer learning setting where one does not have direct access to the source data, but instead employs the source hypotheses trained from them. Building on results for greedy algorithms, we propose an efficient algorithm that selects relevant source hypotheses and feature dimensions simultaneously. On three computer vision datasets we achieve state-of-the-art results, substantially outperforming both popular feature selection and transfer learning baselines when transferring in a small-sample setting. Our experiments involve up to 1000 classes, totalling over a million examples, with only a few training examples from the target domain. We corroborate our findings by showing theoretically that, under reasonable assumptions on the source hypotheses, our algorithm can learn effectively from few examples.
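The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one plausible reading: greedy forward selection of source hypotheses, where each candidate source is represented by its scores on the target training set and sources are added one at a time to minimize a ridge-regularized least-squares loss. The function name `greedy_source_selection`, the loss, and all parameters are illustrative assumptions, and the joint selection of feature dimensions described in the paper is omitted here.

```python
# Hypothetical sketch (not the authors' exact method): greedy forward selection
# of source hypotheses for a binary target task with few labeled examples.
import numpy as np

def greedy_source_selection(source_scores, y, k, lam=1e-3):
    """source_scores: (n_examples, n_sources) scores of each source hypothesis
    on the target training data; y: (n_examples,) labels in {-1, +1};
    k: number of sources to keep; lam: ridge regularization strength."""
    n, m = source_scores.shape
    selected = []
    for _ in range(min(k, m)):
        best_j, best_loss = None, np.inf
        for j in range(m):
            if j in selected:
                continue
            cols = selected + [j]
            X = source_scores[:, cols]
            # Ridge solution restricted to the current candidate subset.
            w = np.linalg.solve(X.T @ X + lam * np.eye(len(cols)), X.T @ y)
            loss = np.mean((X @ w - y) ** 2)
            if loss < best_loss:
                best_j, best_loss = j, loss
        selected.append(best_j)
    # Refit the combination weights on the final subset.
    X = source_scores[:, selected]
    w = np.linalg.solve(X.T @ X + lam * np.eye(len(selected)), X.T @ y)
    return selected, w

# Toy usage: 40 target examples, 100 candidate source hypotheses, keep 5.
rng = np.random.default_rng(0)
scores = rng.standard_normal((40, 100))
labels = np.sign(scores[:, 3] + 0.5 * scores[:, 17] + 0.1 * rng.standard_normal(40))
idx, weights = greedy_source_selection(scores, labels, k=5)
print("selected sources:", idx)
```

In this reading, transfer amounts to learning a sparse linear combination of fixed source hypotheses, which keeps the number of fitted parameters small and is consistent with the small-sample regime the abstract emphasizes.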