VersaTune: Harnessing Vertical Domain Insights for Multi-Ability LLM Supervised Fine-Tuning
Large Language Models (LLMs) exhibit remarkable capabilities in handling multiple tasks across domains due to their emergent properties. These capabilities are further augmented during the Supervised Fine-Tuning (SFT) phase. Despite this potential, existing work mainly focuses on domain-specific enhancement during fine-tuning, which risks catastrophic forgetting of knowledge in other domains. In this study, we introduce VersaTune, a novel data composition framework designed to enhance LLMs' overall multi-ability performance during fine-tuning. We categorize knowledge into distinct domains, including law, medicine, finance, science, and code. We begin by detecting the distribution of domain-specific knowledge within the base model, and then compose training data that aligns with the model's existing knowledge distribution. During fine-tuning, the weights of the different domains are dynamically adjusted based on their learnable potential and forgetting degree. Experimental results demonstrate that VersaTune achieves significant improvements in multi-domain performance, with a 35.21% enhancement on comprehensive multi-domain tasks. Additionally, in scenarios where a specific domain must be optimized, VersaTune reduces the performance degradation in other domains by 38.77% without compromising the target domain's training efficacy.
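The two steps described above, composing data to match a domain distribution and then reweighting domains during training, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names (`compose_mixture`, `adjust_weights`), the multiplicative update rule, the `step` parameter, and the assumption that learnable potential and forgetting degree are scores in [0, 1] are all hypothetical choices made for illustration.

```python
import random

# Domains named in the abstract.
DOMAINS = ["law", "medicine", "finance", "science", "code"]


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {d: w / total for d, w in weights.items()}


def compose_mixture(pools, domain_probs, n_samples, seed=0):
    """Draw a training mix whose domain proportions follow domain_probs.

    pools maps each domain to a list of examples; domain_probs stands in
    for the knowledge distribution detected in the base model (assumed
    to be given here, since the detection method is not specified).
    """
    rng = random.Random(seed)
    probs = normalize(domain_probs)
    domains = list(probs)
    picks = rng.choices(domains, weights=[probs[d] for d in domains], k=n_samples)
    return [(d, rng.choice(pools[d])) for d in picks]


def adjust_weights(weights, learnable, forgetting, step=0.5):
    """Hypothetical update: boost domains that can still improve
    (learnable potential) or that are being forgotten, then renormalize."""
    raw = {
        d: weights[d] * (1.0 + step * (learnable[d] + forgetting[d]))
        for d in weights
    }
    return normalize(raw)
```

In this sketch, a training loop would call `compose_mixture` once per epoch and pass the output of `adjust_weights` as the next epoch's `domain_probs`, so that a domain showing signs of forgetting receives a larger share of the next data mix.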
@article{lu2025_2411.11266,
  title={VersaTune: An Efficient Data Composition Framework for Training Multi-Capability LLMs},
  author={Keer Lu and Keshi Zhao and Zhuoran Zhang and Zheng Liang and Da Pan and Shusen Zhang and Xin Wu and Guosheng Dong and Bin Cui and Tengjiao Wang and Wentao Zhang},
  journal={arXiv preprint arXiv:2411.11266},
  year={2025}
}