Teaching and compressing for low VC-dimension

Abstract

In this work we study the quantitative relation between VC-dimension and two other basic parameters related to learning and teaching. We present relatively efficient constructions of {\em sample compression schemes} and {\em teaching sets} for classes of low VC-dimension. Let $C$ be a finite boolean concept class of VC-dimension $d$. Set $k = O(d 2^d \log \log |C|)$. We construct sample compression schemes of size $k$ for $C$, with additional information of $k \log(k)$ bits. Roughly speaking, given any list of $C$-labelled examples of arbitrary length, we can retain only $k$ labelled examples in a way that allows recovering the labels of all other examples in the list. We also prove that there always exists a concept $c$ in $C$ with a teaching set (i.e., a list of $c$-labelled examples uniquely identifying $c$) of size $k$. Equivalently, we prove that the recursive teaching dimension of $C$ is at most $k$. The question of constructing sample compression schemes for classes of small VC-dimension was suggested by Littlestone and Warmuth (1986), and the problem of constructing teaching sets for classes of small VC-dimension was suggested by Kuhlmann (1999). Previous constructions for general concept classes yielded size $O(\log |C|)$ for both questions, even when the VC-dimension is constant.
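
To make the notion of a teaching set concrete, the following is a minimal brute-force sketch in Python. It is not the construction from the paper: it simply searches all subsets of a small domain for the smallest set of points on which a given concept differs from every other concept in the class. The function names `teaching_set` and `min_teaching_dimension`, the representation of concepts as 0/1 tuples, and the toy class of thresholds are all assumptions made for illustration.

```python
from itertools import combinations


def teaching_set(concept, concepts, domain_size):
    """Return a smallest tuple of domain indices that identifies `concept`.

    Concepts are assumed to be distinct 0/1 tuples of length `domain_size`.
    A set of indices is a teaching set if every other concept disagrees
    with `concept` on at least one of those indices.
    """
    others = [c for c in concepts if c != concept]
    for size in range(domain_size + 1):
        for idxs in combinations(range(domain_size), size):
            if all(any(c[i] != concept[i] for i in idxs) for c in others):
                return idxs
    return None  # not reached when all concepts are distinct


def min_teaching_dimension(concepts, domain_size):
    """Size of the smallest teaching set over all concepts in the class."""
    return min(len(teaching_set(c, concepts, domain_size)) for c in concepts)


# Toy class (hypothetical example): thresholds on a 4-point domain,
# i.e. the concepts x >= t for t = 0, ..., 4. Its VC-dimension is 1.
thresholds = [tuple(1 if x >= t else 0 for x in range(4)) for t in range(5)]

easiest = min_teaching_dimension(thresholds, 4)
```

In this toy class the all-ones and all-zeros concepts can each be taught with a single labelled example, while the intermediate thresholds need two. The exhaustive search is exponential in the domain size and is meant only to illustrate the definition on small classes.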
