R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning

Abstract

Fine-tuning large language models (LLMs) is prohibitively expensive in terms of computational and memory costs. Low-rank Adaptation (LoRA), one of the most popular parameter-efficient fine-tuning (PEFT) methods, offers a cost-effective alternative by approximating the model update $\Delta W \in \mathbb{R}^{m \times n}$ as the product of a down-projection matrix $A \in \mathbb{R}^{m \times r}$ and a head matrix $B \in \mathbb{R}^{r \times n}$, where $r \ll \min(m, n)$. In real-world scenarios, LLMs are fine-tuned on data from multiple domains to perform tasks across various fields, embodying multi-task learning (MTL). LoRA often underperforms in such complex scenarios. To enhance LoRA's capability in multi-task learning, we propose R-LoRA, which incorporates Multi-Head Randomization. Multi-Head Randomization diversifies the head matrices through Multi-Head Random Initialization and Multi-Head Dropout, enabling more efficient learning of task-specific features while maintaining a shared knowledge representation. Extensive experiments demonstrate that R-LoRA better captures task-specific knowledge, thereby improving performance in multi-task scenarios. The code is available at this https URL.

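To make the idea concrete, below is a minimal PyTorch sketch of a multi-head LoRA layer with multi-head randomization, based only on the abstract. The class name `MultiHeadLoRALinear`, the hyperparameters (`num_heads`, `head_dropout`, `alpha`), the zero initialization of the shared down-projection, and the averaging of head outputs are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch only: the module structure, initialization choices,
# and head aggregation are assumptions inferred from the abstract.
import torch
import torch.nn as nn


class MultiHeadLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, num_heads: int = 4,
                 head_dropout: float = 0.1, alpha: float = 16.0):
        super().__init__()
        self.base = base  # frozen pretrained linear layer W
        for p in self.base.parameters():
            p.requires_grad_(False)
        in_f, out_f = base.in_features, base.out_features
        self.scaling = alpha / r

        # Shared down-projection A (zero-initialized here so the update
        # starts at zero; the paper's exact scheme may differ).
        self.A = nn.Parameter(torch.zeros(in_f, r))

        # Multiple head matrices B_i, each with a different random
        # initialization (Multi-Head Random Initialization).
        self.heads = nn.ParameterList(
            [nn.Parameter(torch.randn(r, out_f) * 0.02) for _ in range(num_heads)]
        )

        # Multi-Head Dropout: drops rank-r activations independently per
        # head so the heads learn diversified, task-specific features.
        self.head_dropout = nn.Dropout(head_dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shared = x @ self.A  # project input into the shared rank-r subspace
        delta = sum(self.head_dropout(shared) @ B for B in self.heads)
        delta = delta / len(self.heads)  # average the head outputs
        return self.base(x) + self.scaling * delta
```

As a usage example, wrapping a frozen projection such as `MultiHeadLoRALinear(nn.Linear(768, 768), r=8, num_heads=4)` trains only the shared down-projection and the randomized head matrices while the base weights stay fixed.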
@article{liu2025_2502.15455,
  title={R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning},
  author={Jinda Liu and Yi Chang and Yuan Wu},
  journal={arXiv preprint arXiv:2502.15455},
  year={2025}
}