v1v2v3 (latest)

A Theory of Label Propagation for Subpopulation Shift

International Conference on Machine Learning (ICML), 2021

22 February 2021

Tianle Cai

Ruiqi Gao

Jason D. Lee

Qi Lei

ArXiv (abs)PDF HTML

Abstract

One of the central problems in machine learning is domain adaptation. Unlike past theoretical work, we consider a new model for subpopulation shift in the input or representation space. In this work, we propose a provably effective framework for domain adaptation based on label propagation. In our analysis, we use a simple but realistic expansion assumption, proposed in \citet{wei2021theoretical}. Using a teacher classifier trained on the source domain, our algorithm not only propagates to the target domain but also improves upon the teacher. By leveraging existing generalization bounds, we also obtain end-to-end finite-sample guarantees on the entire algorithm. In addition, we extend our theoretical framework to a more general setting of source-to-target transfer based on a third unlabeled dataset, which can be easily applied in various learning scenarios. Inspired by our theory, we adapt consistency-based semi-supervised learning methods to domain adaptation settings and gain significant improvements.

View on arXiv

Comments on this paper