
Graph-based Discriminators: Sample Complexity and Expressiveness

Neural Information Processing Systems (NeurIPS), 2019
Abstract

A basic question in learning theory is to identify if two distributions are identical when we have access only to examples sampled from the distributions. This basic task is considered, for example, in the context of Generative Adversarial Networks (GANs), where a discriminator is trained to distinguish between a real-life distribution and a synthetic distribution. Classically, we use a hypothesis class $H$ and claim that the two distributions are distinct if for some $h \in H$ the expected value on the two distributions is (significantly) different. Our starting point is the following fundamental problem: "is having the hypothesis depend on more than a single random example beneficial?" To address this challenge we define $k$-ary based discriminators, which have a family of Boolean $k$-ary functions $\mathcal{G}$. Each function $g \in \mathcal{G}$ naturally defines a hyper-graph, indicating whether a given hyper-edge exists. A function $g \in \mathcal{G}$ distinguishes between two distributions if the expected value of $g$ on a $k$-tuple of i.i.d. examples differs (significantly) between the two distributions. We study the expressiveness of families of $k$-ary functions, compared to the classical hypothesis class $H$, which corresponds to $k=1$. We show a separation in expressiveness between $(k+1)$-ary and $k$-ary functions. This demonstrates the great benefit of having $k \geq 2$ as distinguishers. For $k \geq 2$ we introduce a notion similar to the VC-dimension, and show that it controls the sample complexity. We proceed and provide upper and lower bounds as a function of our extended notion of VC-dimension.
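To make the $k$-ary notion concrete, below is a minimal sketch (illustrative, not the paper's construction) that Monte Carlo-estimates the distinguishing statistic $|\mathbb{E}_{P^k}[g] - \mathbb{E}_{Q^k}[g]|$ for a hypothetical 2-ary collision function $g(x,y) = \mathbb{1}[x = y]$. The samplers, tolerance eps, and sample counts are all assumptions chosen for the example.

# Minimal sketch (illustrative, not the paper's construction): estimate the
# k-ary distinguishing statistic |E_{P^k}[g] - E_{Q^k}[g]| by Monte Carlo.
import random

def estimate_mean(g, sampler, k, n_tuples, rng):
    # Monte Carlo estimate of E[g(x_1, ..., x_k)] over k i.i.d. samples.
    total = 0
    for _ in range(n_tuples):
        total += g(*(sampler(rng) for _ in range(k)))
    return total / n_tuples

def distinguishes(g, sampler_p, sampler_q, k, eps, n_tuples=100_000, seed=0):
    # g "distinguishes" P from Q if the two expected values differ by >= eps.
    rng = random.Random(seed)
    mu_p = estimate_mean(g, sampler_p, k, n_tuples, rng)
    mu_q = estimate_mean(g, sampler_q, k, n_tuples, rng)
    return abs(mu_p - mu_q) >= eps, mu_p, mu_q

if __name__ == "__main__":
    # P is uniform on 10 values; Q is uniform on some 5-element subset.
    # The 2-ary collision function g(x, y) = 1[x == y] detects the smaller
    # support (collision rate 0.2 vs. 0.1) regardless of which subset it is.
    sampler_p = lambda rng: rng.randrange(10)
    sampler_q = lambda rng: rng.choice([0, 2, 4, 6, 8])
    g = lambda x, y: int(x == y)
    sep, mu_p, mu_q = distinguishes(g, sampler_p, sampler_q, k=2, eps=0.05)
    print(f"E_P[g] ~ {mu_p:.3f}, E_Q[g] ~ {mu_q:.3f}, distinguishes: {sep}")

The collision example hints at why $k \geq 2$ can help: a single fixed 1-ary hypothesis must commit to a particular subset in advance, whereas the one collision function handles every small-support $Q$ at once, informally mirroring the expressiveness gap the abstract describes.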
