Amortized Prompt: Guide CLIP to Domain Transfer Learning
- VLM
Domain generalization (DG) is a difficult transfer learning problem that aims to learn models which generalize to unseen domains. Recent massively pre-trained models such as CLIP and GPT-3, i.e., foundation models (FMs), are robust to many distribution shifts and should therefore lead to substantial improvements in DG. In this work, we study generic ways to adopt CLIP for DG problems in image classification. We evaluate both the Test-Time Adaptation (TTA) and the full DG learning settings on several standard benchmarks. We propose AP (Amortized Prompt) as a novel prompt strategy that performs domain inference in the form of prompt generation. Moreover, we show that combining domain prompt inference with CLIP enables the model to outperform strong DG baselines and other prompt strategies. Since AP generates prompts that automatically adapt to the target domain, it can be viewed as a TTA method, so we also conduct a fair comparison with state-of-the-art TTA methods. The results show that AP outperforms all baselines by a significant margin. We further analyze the properties of AP through ablation experiments. We hope the simplicity and success of our approach emphasize the importance of foundation models and lead to their broader adoption and analysis in the fields of TTA and DG.
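The abstract gives no implementation details, but a minimal sketch of the idea might look as follows, assuming AP amortizes domain inference by mapping CLIP image features to prompt token embeddings that condition a frozen text encoder. The module names, dimensions, the batch-averaged domain prompt, and the placeholder text encoder are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D = 512   # shared CLIP embedding width (assumed)
M = 4     # number of generated prompt tokens (assumed)
K = 7     # number of classes (e.g., PACS has 7)

class PromptGenerator(nn.Module):
    """Amortized map from image features to domain-prompt token embeddings."""
    def __init__(self, dim=D, num_tokens=M):
        super().__init__()
        self.num_tokens = num_tokens
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, num_tokens * dim)
        )

    def forward(self, img_feat):                      # (B, D)
        return self.mlp(img_feat).view(-1, self.num_tokens, D)  # (B, M, D)

# Placeholder for CLIP's frozen text encoder: real code would prepend the
# generated tokens to class-name token embeddings and run CLIP's transformer.
text_encoder = nn.Sequential(nn.Flatten(1), nn.Linear((M + 1) * D, D))

class_name_emb = torch.randn(K, 1, D)   # stand-in class-name token embeddings
img_feat = torch.randn(8, D)            # stand-in CLIP image features

gen = PromptGenerator()
prompt = gen(img_feat).mean(dim=0, keepdim=True)     # (1, M, D): one prompt per batch
ctx = torch.cat([prompt.expand(K, -1, -1), class_name_emb], dim=1)  # (K, M+1, D)
txt = F.normalize(text_encoder(ctx), dim=-1)         # (K, D) class text features
img = F.normalize(img_feat, dim=-1)                  # (B, D) image features
logits = 100.0 * img @ txt.t()                       # cosine-similarity logits
print(logits.shape)                                  # torch.Size([8, 7])
```

In a real CLIP pipeline, the generated tokens would be prepended to the tokenized class names and passed through CLIP's actual text transformer; averaging the generated prompt over a test batch would correspond to inferring a single prompt for the target domain, which is what makes the scheme usable at test time without labels.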
View on arXiv