
Wasserstein Distributionally Robust Optimization with Heterogeneous Data Sources

Abstract

We study decision problems under uncertainty, where the decision-maker has access to K data sources that carry biased information about the underlying risk factors. The biases are measured by the mismatch between the risk factor distribution and the K data-generating distributions with respect to an optimal transport (OT) distance. In this situation, the decision-maker can exploit the information contained in the biased samples by solving a distributionally robust optimization (DRO) problem, where the ambiguity set is defined as the intersection of K OT neighborhoods, each of which is centered at the empirical distribution on the samples generated by a biased data source. We show that if the decision-maker has a prior belief about the biases, then the out-of-sample performance of the DRO solution can improve with K, irrespective of the magnitude of the biases. We also show that, under standard convexity assumptions, the proposed DRO problem is computationally tractable if either K or the dimension of the risk factors is kept constant.
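
The DRO problem described in the abstract can be sketched in symbols as follows. The notation is an assumption made for illustration and is not taken from the source: x is the decision with feasible set X, \xi is the vector of risk factors, \ell is a loss function, \widehat{P}_k is the empirical distribution of the samples from data source k, W is the OT distance, and r_k \ge 0 is a radius reflecting the assumed bias of source k.

```latex
% Illustrative notation only; all symbols are assumptions, not the paper's.
\[
  \min_{x \in X} \; \sup_{Q \in \mathcal{B}} \; \mathbb{E}_{\xi \sim Q}\bigl[\ell(x, \xi)\bigr],
  \qquad
  \mathcal{B} \;=\; \bigcap_{k=1}^{K}
  \bigl\{\, Q \;:\; W\bigl(Q, \widehat{P}_k\bigr) \le r_k \,\bigr\}.
\]
```

Under this reading, each additional data source adds one more constraint on Q, so the ambiguity set \mathcal{B} can only shrink or stay the same as K grows.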
