Multi-Domain Active Learning: A Comparative Study
Multi-domain learning (MDL) refers to learning a set of models simultaneously, with each one specialized to perform a task in a certain domain. MDL generally demands high labeling effort, as data must be labeled by human experts for every domain. To address this issue, active learning (AL) can be utilized to reduce the labeling effort by selecting only the most informative data. The resulting paradigm is termed multi-domain active learning (MDAL). However, despite the practical significance of MDAL, there exists little research on it, let alone any off-the-shelf solution. To fill this gap, we construct a simple MDAL pipeline and present a comprehensive comparative study of 30 MDAL algorithms, established by combining 6 representative MDL models (equipped with various information-sharing schemes) and 5 widely used AL strategies. We evaluate the algorithms on 6 datasets spanning textual and visual classification tasks. In most cases, AL brings notable improvements to MDL, and surprisingly, the naive best vs second best (BvSB) uncertainty strategy performs competitively with state-of-the-art AL strategies. Moreover, among the MDL models, MAN and SDL-joint achieve the top performance when applied to vector features and raw images, respectively. Furthermore, we qualitatively analyze the behaviors of these strategies and models, shedding light on their superior performance in the comparison. Overall, we provide guidelines that could help practitioners choose MDL models and AL strategies for particular applications.
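The BvSB uncertainty strategy highlighted above can be sketched as follows. This is a minimal illustration, not code from the paper: it assumes the model outputs per-class probabilities (e.g., softmax scores), and the function names are ours.

```python
import numpy as np

def bvsb_scores(probs):
    """BvSB margin per sample: difference between the highest and
    second-highest predicted class probabilities. A small margin means
    the model is torn between two classes, i.e., the sample is uncertain."""
    sorted_probs = np.sort(probs, axis=1)
    return sorted_probs[:, -1] - sorted_probs[:, -2]

def select_queries(probs, budget):
    """Pick the `budget` unlabeled samples with the smallest BvSB margins
    as the next batch to send for human labeling."""
    return np.argsort(bvsb_scores(probs))[:budget]

# Illustrative usage: sample 0 is near the decision boundary (0.50 vs 0.45),
# sample 1 is confidently classified (0.90 vs 0.05).
probs = np.array([[0.50, 0.45, 0.05],
                  [0.90, 0.05, 0.05]])
queries = select_queries(probs, budget=1)  # selects sample 0
```

In an MDAL setting, such a per-sample score would be computed with each domain's specialized model over that domain's unlabeled pool.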