Why Train Everything? Tint a Single Layer for Multi-task Model Merging
Model merging integrates independently fine-tuned models into a single multi-task model, offering a flexible alternative to joint training. However, many existing model merging methods introduce additional task-specific components, increasing complexity and requiring extra modifications. We propose Model Tinting, a lightweight yet highly effective approach that improves model merging by updating just a single layer, accounting for as little as 0.5% of total parameters. Our key observation is that explicit task-specific modules are not necessary: subtle adjustments to a single layer can effectively capture task-specific variations within the merged model while maintaining generalization. We also introduce a confidence-based filtering mechanism that alleviates the impact of unreliable predictions from the individual models on the merged model. Extensive experiments across vision and NLP tasks demonstrate that Model Tinting achieves state-of-the-art performance, even on challenging dense prediction tasks. Our code is available at this https URL.
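Since the abstract only outlines the method, the following is a minimal PyTorch sketch of the general idea: merge fine-tuned checkpoints into one model, then update only a single chosen layer, supervising it with an individual expert model's pseudo-labels and keeping only the confident ones. The uniform task-arithmetic merge, the pseudo-label objective, and names such as `layer_name`, `alpha`, and `tau` are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch of single-layer "tinting" after model merging.
# The merge rule, loss, and hyperparameters below are assumptions for
# demonstration, not the authors' exact algorithm.
import torch
import torch.nn.functional as F


def merge_state_dicts(finetuned_states, pretrained_state, alpha=0.3):
    """Merge fine-tuned checkpoints via simple task arithmetic:
    theta_merged = theta_pre + alpha * sum_t (theta_t - theta_pre)."""
    merged = {k: v.clone() for k, v in pretrained_state.items()}
    for state in finetuned_states:
        for k in merged:
            if merged[k].is_floating_point():  # skip buffers like step counters
                merged[k] += alpha * (state[k] - pretrained_state[k])
    return merged


def tint_single_layer(model, layer_name, loader, expert,
                      tau=0.9, steps=100, lr=1e-4):
    """Train only one layer of the merged model ("tinting"), using an
    expert model's confident predictions as pseudo-labels (a
    confidence-based filter in the spirit of the abstract)."""
    # Freeze everything except the chosen layer.
    for name, p in model.named_parameters():
        p.requires_grad = name.startswith(layer_name)
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.AdamW(params, lr=lr)

    for step, x in enumerate(loader):  # unlabeled inputs for this task
        if step >= steps:
            break
        with torch.no_grad():
            probs = F.softmax(expert(x), dim=-1)
            conf, pseudo = probs.max(dim=-1)
            keep = conf >= tau  # drop unreliable expert predictions
        if not keep.any():
            continue
        loss = F.cross_entropy(model(x)[keep], pseudo[keep])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Because only one layer's parameters receive gradients, the trainable fraction stays tiny (the abstract reports as little as 0.5% of total parameters), while the confidence threshold `tau` controls how aggressively unreliable expert predictions are filtered out.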