
Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models

Main:12 Pages
6 Figures
Bibliography:1 Pages
10 Tables
Appendix:5 Pages
Abstract

Multi-lingual ability transfer has become increasingly important for the broad application of large language models (LLMs). Existing work relies heavily on training with multi-lingual ability-related data, which may not be available for low-resource languages. To address this, we propose a Multi-lingual Abilities Extraction and Combination approach, named MAEC. Our key idea is to decompose and extract language-agnostic ability-related weights from LLMs, and to combine them across different languages by simple addition and subtraction operations without training. Specifically, MAEC consists of an extraction stage and a combination stage. In the extraction stage, we first locate key neurons that are highly related to specific abilities, and then use them to extract the transferable ability-related weights. In the combination stage, we further select the ability-related tensors that mitigate linguistic effects, and design a strategy that combines them with the language-specific weights to build the multi-lingual ability-enhanced LLM. To assess the effectiveness of our approach, we conduct extensive experiments with LLaMA-3 8B on mathematical and scientific tasks in both high-resource and low-resource language scenarios. Experimental results show that MAEC can effectively and efficiently extract and combine the advanced abilities, achieving performance comparable to PaLM. Resources are available at this https URL.
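
As a rough illustration of what combining weights "by simple addition and subtraction operations without training" could look like, the following is a minimal sketch and not the MAEC implementation. Everything in it is an assumption made for illustration: the helper names (`subtract`, `keep_top_k`, `combine`), the use of a top-k magnitude mask as a stand-in for key-neuron location, the `selected` set standing in for tensor selection, the scaling factor `alpha`, and the toy weights stored as plain Python lists.

```python
from typing import Dict, List, Set

# Toy stand-in for a model checkpoint: tensor name -> flat list of weights.
Weights = Dict[str, List[float]]


def subtract(a: Weights, b: Weights) -> Weights:
    """Element-wise difference a - b, tensor by tensor."""
    return {name: [x - y for x, y in zip(a[name], b[name])] for name in a}


def keep_top_k(delta: Weights, k: int) -> Weights:
    """Zero all but the k largest-magnitude entries of each tensor.

    Hypothetical stand-in for locating key neurons; the paper's actual
    criterion is not described in the abstract.
    """
    masked: Weights = {}
    for name, values in delta.items():
        order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
        top = set(order[:k])
        masked[name] = [v if i in top else 0.0 for i, v in enumerate(values)]
    return masked


def combine(language_model: Weights, ability_delta: Weights,
            selected: Set[str], alpha: float = 1.0) -> Weights:
    """Add the ability delta to the language-specific weights.

    Only tensors named in `selected` are modified (assumed selection rule);
    all other tensors are copied from the language-specific model unchanged.
    """
    merged: Weights = {}
    for name, values in language_model.items():
        if name in selected:
            merged[name] = [w + alpha * d for w, d in zip(values, ability_delta[name])]
        else:
            merged[name] = list(values)
    return merged


# Toy weights (hypothetical): a base model, an ability-tuned model,
# and a language-tuned model sharing the same tensor names.
base = {"layer0": [0.0, 0.0, 0.0], "layer1": [1.0, 1.0, 1.0]}
ability_model = {"layer0": [0.5, 0.01, -0.25], "layer1": [1.0, 2.0, 1.0]}
language_model = {"layer0": [0.25, 0.25, 0.25], "layer1": [1.5, 1.0, 1.0]}

# Extraction: subtract the base weights, then keep only the largest changes.
ability_delta = keep_top_k(subtract(ability_model, base), k=2)

# Combination: add the extracted delta to the selected tensors only.
merged = combine(language_model, ability_delta, selected={"layer0"})
print(merged)
```

No gradient step appears anywhere in the sketch: extraction is a subtraction followed by masking, and combination is an addition restricted to the selected tensors, which is the sense in which such a procedure requires no training.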
