The Limitations of Optimization from Samples

19 December 2015

Abstract

In this paper we consider the following question: can we optimize decisions on models learned from data and be guaranteed that we achieve desirable outcomes? We formalize this question through a novel framework called optimization from samples (OPS). In the OPS framework, we are given sampled values of a function drawn from some distribution and the objective is to optimize the function under some constraint. We show that there are classes of functions which have desirable learnability and optimizability guarantees and for which no reasonable approximation for optimization from samples is achievable. In particular, our main result shows that even for maximization of coverage functions under a cardinality constraint $k$ , there exists a hypothesis class of functions that cannot be approximated within a factor of $n^{-1/4 + \epsilon}$ (for any constant $\epsilon > 0$ ) of the optimal solution, from samples drawn from the uniform distribution over all sets of size at most $k$ . In the general case of monotone submodular functions, we show an $n^{-1/3 + \epsilon}$ lower bound and an almost matching $\tilde{\Omega}(n^{-1/3})$ -optimization from samples algorithm. On the positive side, if a monotone subadditive function has bounded curvature we obtain desirable guarantees. We also show that additive and unit-demand functions can be optimized from samples to within arbitrarily good precision, and that budget additive functions can be optimized from samples to a factor of 1/2.

View on arXiv

Comments on this paper