LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Reasoning

17 December 2024
Hongbin Zhang
Kehai Chen
Xuefeng Bai
Yang Xiang
Min Zhang
Abstract

Large language models (LLMs) have exhibited impressive multilingual reasoning capabilities, driven by extensive multilingual pre-training corpora and instruction fine-tuning data. However, a performance gap persists between high- and low-resource language reasoning tasks due to the language imbalance in the pre-training corpus, a gap further exacerbated by evaluation bias in existing reasoning benchmarks, which lack low-resource language coverage. To alleviate this issue, we propose LinguaLIFT, a two-stage instruction tuning framework for advancing low-resource language reasoning. LinguaLIFT employs a language alignment layer that captures multilingual alignment through code-switched tuning, without requiring multilingual instruction or parallel data, and then transfers cross-lingual reasoning capabilities to low-resource languages using English-only instruction tuning data. To evaluate multilingual reasoning capabilities comprehensively, we introduce the Multilingual Math Word Problem (MMWP) benchmark, which spans 21 low-resource, 17 medium-resource, and 10 high-resource languages. Experimental results show that LinguaLIFT outperforms several competitive baselines on MMWP and four widely used benchmarks.
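The abstract's two-stage recipe can be read as a freezing schedule over a backbone LLM plus a small alignment module. Below is a minimal PyTorch sketch of one plausible reading: stage 1 trains only a language alignment layer on code-switched text with the backbone frozen; stage 2 freezes that layer and tunes the backbone on English-only instruction data. The class names (ToyLLM, LinguaLIFTModel), the placement of the alignment layer after the embeddings, and its single-linear-layer design are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class ToyLLM(nn.Module):
    """Toy stand-in for a pretrained LLM backbone (embeddings + decoder + head)."""
    def __init__(self, vocab=32000, d=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=8, batch_first=True),
            num_layers=2)
        self.lm_head = nn.Linear(d, vocab)

class LinguaLIFTModel(nn.Module):
    """Backbone with an extra language alignment layer after the embeddings
    (assumed placement; a single linear map is the simplest possible choice)."""
    def __init__(self, llm: ToyLLM, d=512):
        super().__init__()
        self.llm = llm
        self.align = nn.Linear(d, d)

    def forward(self, ids):
        h = self.align(self.llm.embed(ids))
        return self.llm.lm_head(self.llm.decoder(h))

def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

model = LinguaLIFTModel(ToyLLM())

# Stage 1: train only the alignment layer on code-switched text;
# per the abstract, no parallel or multilingual instruction data is needed.
set_trainable(model.llm, False)
set_trainable(model.align, True)
stage1_opt = torch.optim.AdamW(model.align.parameters(), lr=1e-4)

# Stage 2: freeze the alignment layer and tune the backbone on
# English-only instruction data, transferring reasoning cross-lingually.
set_trainable(model.align, False)
set_trainable(model.llm, True)
stage2_opt = torch.optim.AdamW(model.llm.parameters(), lr=2e-5)

# Sanity check: a forward pass on dummy token ids.
ids = torch.randint(0, 32000, (2, 16))
logits = model(ids)  # shape (2, 16, 32000)

The design point the abstract emphasizes is that the multilingual signal enters only through stage 1's code-switched tuning, so stage 2 can rely entirely on abundant English instruction data.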

@article{zhang2025_2412.12499,
  title={LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Reasoning},
  author={Hongbin Zhang and Kehai Chen and Xuefeng Bai and Yang Xiang and Min Zhang},
  journal={arXiv preprint arXiv:2412.12499},
  year={2025}
}