Teaching and compressing for low VC-dimension

22 February 2015

Abstract

In this work we study the quantitative relation between VC-dimension and two other basic parameters related to learning and teaching. We present relatively efficient constructions of {\em sample compression schemes} and {\em teaching sets} for classes of low VC-dimension. Let $C$ be a finite boolean concept class of VC-dimension $d$ . Set $k = O(d 2^d \log \log |C|)$ . We construct sample compression schemes of size $k$ for $C$ , with additional information of $k \log(k)$ bits. Roughly speaking, given any list of $C$ -labelled examples of arbitrary length, we can retain only $k$ labeled examples in a way that allows to recover the labels of all others examples in the list. We also prove that there always exists a concept $c$ in $C$ with a teaching set (i.e. a list of $c$ -labelled examples uniquely identifying $c$ ) of size $k$ . Equivalently, we prove that the recursive teaching dimension of $C$ is at most $k$ . The question of constructing sample compression schemes for classes of small VC-dimension was suggested by Littlestone and Warmuth (1986), and the problem of constructing teaching sets for classes of small VC-dimension was suggested by Kuhlmann (1999). Previous constructions for general concept classes yielded size $O(\log |C|)$ for both questions, even when the VC-dimension is constant.

View on arXiv

Comments on this paper