PAD: Towards Efficient Data Generation for Transfer Learning Using Phrase Alignment

24 March 2025
Jong Myoung Kim
Young-Jun Lee
Ho-Jin Choi
Sangkeun Jung
Abstract

Transfer learning leverages the abundance of English data to address the scarcity of resources in modeling non-English languages, such as Korean. In this study, we explore the potential of Phrase Aligned Data (PAD) from standardized Statistical Machine Translation (SMT) to enhance the efficiency of transfer learning. Through extensive experiments, we demonstrate that PAD synergizes effectively with the syntactic characteristics of the Korean language, mitigating the weaknesses of SMT and significantly improving model performance. Moreover, we reveal that PAD complements traditional data construction methods and enhances their effectiveness when combined. This innovative approach not only boosts model performance but also suggests a cost-efficient solution for resource-scarce languages.

@article{kim2025_2503.18250,
  title={PAD: Towards Efficient Data Generation for Transfer Learning Using Phrase Alignment},
  author={Jong Myoung Kim and Young-Jun Lee and Ho-Jin Choi and Sangkeun Jung},
  journal={arXiv preprint arXiv:2503.18250},
  year={2025}
}