Generalization Bounds with Minimal Dependency on Hypothesis Class via Distributionally Robust Optimization

Neural Information Processing Systems (NeurIPS), 2021
Abstract

Established approaches to obtaining generalization bounds in data-driven optimization and machine learning mostly build on solutions from empirical risk minimization (ERM), which depend crucially on the functional complexity of the hypothesis class. In this paper, we present an alternative route to obtaining such bounds for the solution of distributionally robust optimization (DRO), a recent data-driven optimization framework based on worst-case analysis and the notion of an ambiguity set that captures statistical uncertainty. In contrast to the hypothesis-class complexity in ERM, our DRO bounds depend on the geometry of the ambiguity set and its compatibility with the true loss function. Notably, when maximum mean discrepancy is used as the DRO distance metric, our analysis implies generalization bounds that depend solely on the true loss function. To the best of our knowledge, this is the first generalization bound in the literature that is entirely independent of any other candidate in the hypothesis class. We hope our findings open the door to a better understanding of DRO, especially its benefits for loss minimization and other machine learning applications.
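For readers unfamiliar with the framework, here is a minimal sketch of the standard DRO formulation the abstract refers to, in generic notation assumed for illustration and not taken from the paper. Given the empirical distribution $\hat{P}_n$, a statistical distance $d$ (e.g., maximum mean discrepancy), and a radius $\delta > 0$, DRO replaces the ERM objective $\min_{\theta \in \Theta} \mathbb{E}_{X \sim \hat{P}_n}[\ell(\theta, X)]$ with a worst case over the ambiguity set:

\[
\min_{\theta \in \Theta} \; \sup_{Q \,:\, d(Q, \hat{P}_n) \le \delta} \; \mathbb{E}_{X \sim Q}\big[\ell(\theta, X)\big].
\]

To make the MMD choice concrete, the following is a short, self-contained Python sketch of the standard biased empirical MMD^2 estimator with a Gaussian kernel; the function names and bandwidth value are illustrative assumptions, not the paper's implementation.

import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # Pairwise Gaussian kernel: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(X, Y, bandwidth=1.0):
    # Biased empirical estimate of MMD^2 between samples X ~ P and Y ~ Q:
    # mean k(X, X) + mean k(Y, Y) - 2 * mean k(X, Y).
    return (gaussian_kernel(X, X, bandwidth).mean()
            + gaussian_kernel(Y, Y, bandwidth).mean()
            - 2.0 * gaussian_kernel(X, Y, bandwidth).mean())

# Toy check: MMD^2 grows as the two sample distributions separate.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))
Y = rng.normal(0.5, 1.0, size=(200, 2))
print(mmd_squared(X, Y))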
