Offline Recommender Learning Meets Unsupervised Domain Adaptation
Eliminating selection bias in rating feedback is critical to constructing a well-performing recommender offline. A promising current solution is propensity weighting, which models the missing mechanism of rating feedback. However, the performance of existing propensity-based algorithms can be significantly degraded by bias in propensity estimation. To alleviate this problem, we formulate missing-not-at-random recommendation as an unsupervised domain adaptation problem and derive a propensity-independent generalization error bound. We further propose a corresponding algorithm that minimizes the bound via adversarial learning. Our theoretical framework and algorithm do not depend on the propensity score and can obtain a well-performing rating predictor without the true propensity information. Empirical evaluation on real-world benchmark datasets demonstrates the effectiveness and real-world applicability of the proposed approach.
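To make the motivation concrete, the following is a minimal sketch of the standard inverse propensity weighting estimator that the abstract refers to, not of the proposed adversarial method. The function names (`ips_estimate`, `expected_ips`) and the toy numbers are hypothetical and chosen only for illustration. The sketch assumes each rating is observed independently with its own propensity, and it shows that the estimator matches the ideal loss in expectation when the true propensities are used but drifts away from it when they are misestimated.

```python
def ips_estimate(errors, observed, propensities):
    """Inverse-propensity-weighted estimate of the mean error over all
    user-item pairs, using only the pairs whose rating was observed."""
    n = len(errors)
    return sum(o * e / p for e, o, p in zip(errors, observed, propensities)) / n


def expected_ips(errors, true_propensities, assumed_propensities):
    """Expectation of ips_estimate when each observation indicator is an
    independent Bernoulli draw with the corresponding true propensity."""
    n = len(errors)
    return sum(
        p_true * e / p_assumed
        for e, p_true, p_assumed in zip(errors, true_propensities, assumed_propensities)
    ) / n


# Toy data (illustrative assumption): low-error pairs are observed often,
# high-error pairs rarely, which is a missing-not-at-random pattern.
errors = [0.1, 0.2, 0.9, 1.0]
true_p = [0.8, 0.8, 0.1, 0.1]
wrong_p = [0.8, 0.8, 0.2, 0.2]  # rare pairs' propensities overestimated

ideal = sum(errors) / len(errors)                 # loss over all pairs
unbiased = expected_ips(errors, true_p, true_p)   # equals the ideal loss
biased = expected_ips(errors, true_p, wrong_p)    # underestimates the ideal loss

print(ideal, unbiased, biased)
```

In this toy setting, overestimating the propensities of the rarely observed pairs shrinks their weights, so the estimator underestimates the true loss. This sensitivity to propensity estimation is the issue a propensity-independent bound is meant to avoid.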