Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation

Model merging integrates the parameters of multiple models into a unified model, combining their diverse capabilities. Existing model merging methods are often constrained by fixed parameter merging ratios. In this study, we propose Mixup Model Merge (M³), an innovative approach inspired by the Mixup data augmentation technique. This method merges the parameters of two large language models (LLMs) by randomly generating linear interpolation ratios, allowing for a more flexible and comprehensive exploration of the parameter space. Extensive experiments demonstrate the superiority of the proposed M³ method in merging fine-tuned LLMs: (1) it significantly improves performance across multiple tasks, (2) it enhances LLMs' out-of-distribution (OOD) robustness and adversarial robustness, (3) it achieves superior results when combined with sparsification techniques such as DARE, and (4) it offers a simple yet efficient solution that does not require additional computational resources. In conclusion, M³ is a simple yet effective model merging method that significantly enhances the performance of the merged model by randomly generating contribution ratios for two fine-tuned LLMs. The code is available at this https URL.
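The abstract describes merging two fine-tuned LLMs by linearly interpolating their parameters with a randomly drawn ratio. Below is a minimal sketch of that idea, assuming PyTorch-style state dicts and a single ratio λ sampled from a Beta distribution (as in Mixup) and applied uniformly to all parameters; the function name, the `alpha` hyperparameter, and the sampling choice are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def mixup_merge(state_dict_a, state_dict_b, alpha=2.0, seed=None):
    """Merge two models' parameters with a randomly generated interpolation ratio.

    theta_merged = lam * theta_a + (1 - lam) * theta_b,  lam ~ Beta(alpha, alpha)
    """
    rng = np.random.default_rng(seed)
    lam = float(rng.beta(alpha, alpha))  # random contribution ratio in (0, 1)
    merged = {}
    for name, param_a in state_dict_a.items():
        param_b = state_dict_b[name]  # assumes both models share the same architecture
        merged[name] = lam * param_a + (1.0 - lam) * param_b
    return merged, lam

# Usage (hypothetical): merged_sd, lam = mixup_merge(model_a.state_dict(), model_b.state_dict())
# merged_model.load_state_dict(merged_sd)
```

Because λ is resampled rather than fixed, repeated merges explore different points along the interpolation path between the two fine-tuned models, which is the flexibility the abstract contrasts with fixed-ratio merging.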