Theoretical Perspective of Deep Domain Adaptation
Deep domain adaptation has been applied successfully in many machine learning applications. Compared with its shallow rivals, deep domain adaptation generally achieves higher predictive performance and is better at modeling rich structured data (e.g., images and sequential data). The underlying idea is to bridge the gap between the source and target domains in a joint feature space so that a supervised classifier trained on labeled source data transfers effectively to the target domain. While this idea is appealing and intuitive, its theoretical underpinnings are woefully incomplete. Our goal in this paper is to develop a rigorous framework to study and explain why such a gap in the intermediate joint space can be formulated and minimized. More specifically, we first study the loss incurred when transferring learning from the source to the target domain. This lays the foundation for our next contribution: a rigorous explanation of why closing the gap between the two domains in the joint feature space directly minimizes the loss incurred when transferring classification learning between the two domains. We provide concrete theoretical results that quantify such gaps and the corresponding reduction in classification error achieved by transfer learning.
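As an illustration of the kind of guarantee involved, and not this paper's own result, the classical bound of Ben-David et al. (2010) relates the target risk of a hypothesis $h$ to its source risk plus a measure of the gap between the two domains:

$$\epsilon_T(h) \;\le\; \epsilon_S(h) \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}\big(\mathcal{D}_S, \mathcal{D}_T\big) \;+\; \lambda^*,$$

where $\epsilon_S$ and $\epsilon_T$ denote the source and target classification errors, $d_{\mathcal{H}\Delta\mathcal{H}}$ measures the discrepancy between the source distribution $\mathcal{D}_S$ and the target distribution $\mathcal{D}_T$, and $\lambda^*$ is the smallest joint error achievable by a single hypothesis on both domains. Bounds of this form make precise why shrinking the domain gap in a shared feature space directly controls the error incurred when a source-trained classifier is transferred to the target domain.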