VeFA: Vector-Based Feature Space Adaptation for Robust Model Fine-Tuning
Catastrophic forgetting is a well-documented challenge in model fine-tuning, particularly when the downstream domain has limited labeled data or differs substantially from the pre-training distribution. Existing parameter-efficient fine-tuning methods largely operate in the weight space by modifying or augmenting the parameters of the pre-trained model, which can lead to models that are overly specialized to the observed downstream data. Recent studies suggest that one mechanism underlying such forgetting is the introduction of intruder dimensions into the representation space during fine-tuning. To mitigate the risk of overwriting pre-trained knowledge and to enhance robustness, we propose Vector-based Feature Adaptation (VeFA), a new fine-tuning method that operates directly in the feature space, which naturally avoids generating intruder dimensions. VeFA performs element-wise adaptation on individual features, thereby ensuring that the effective fine-tuned weights always remain within the column space of the pre-trained weight matrix. This feature-space adaptation perspective is inspired by the idea of effect equivalence modeling (EEM) of downstream lurking variables that induce distribution shifts, which posits that the influence of unobserved factors can be represented as an equivalent aggregate effect on observed features. By compensating for the effects of downstream lurking variables via a lightweight feature-level transformation, VeFA preserves the pre-trained representations and improves model generalization under distribution shift. We evaluate VeFA against LoRA on image classification, NLU, and NLG benchmarks, considering both standard fine-tuning performance and robustness; across these tasks, VeFA achieves comparable fine-tuning performance while consistently exhibiting stronger robustness.
View on arXiv