Wasserstein Distributionally Robust Optimization with Heterogeneous Data
Sources
We study decision problems under uncertainty, where the decision-maker has access to data sources that carry {\em biased} information about the underlying risk factors. The biases are measured by the mismatch between the risk factor distribution and the data-generating distributions with respect to an optimal transport (OT) distance. In this situation the decision-maker can exploit the information contained in the biased samples by solving a distributionally robust optimization (DRO) problem, where the ambiguity set is defined as the intersection of OT neighborhoods, each of which is centered at the empirical distribution on the samples generated by a biased data source. We show that if the decision-maker has a prior belief about the biases, then the out-of-sample performance of the DRO solution can improve with -- irrespective of the magnitude of the biases. We also show that, under standard convexity assumptions, the proposed DRO problem is computationally tractable if either or the dimension of the risk factors is kept constant.
View on arXiv