Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation
Abstract
We present a simple method for improving neural machine translation of a low-resource language pair using parallel data from a related, also low-resource, language pair. The method builds on the transfer approach of Zoph et al. (2016), but whereas their method ignores any overlap between the two source vocabularies, ours exploits it. First, we split words using Byte Pair Encoding (BPE) to increase the vocabulary overlap. Then, we train a model on the first language pair and transfer its parameters, including its source word embeddings, to a second model, which we continue training on the second language pair. Our experiments show that while BPE and transfer learning perform inconsistently on their own, together they improve translation quality by up to 1.8 BLEU.
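To make the transfer step concrete, here is a minimal sketch in PyTorch, not the authors' implementation: the `TinyNMT` class, its dimensions, and the toy shared BPE vocabulary are illustrative assumptions. Only the overall scheme follows the abstract: build one subword vocabulary over both source languages, train a parent model on the related pair, copy all of its parameters, source embeddings included, into a child model, and continue training on the low-resource pair.

```python
import torch
import torch.nn as nn

class TinyNMT(nn.Module):
    """A toy encoder-decoder stand-in for a full NMT system (assumed architecture)."""
    def __init__(self, src_vocab_size, tgt_vocab_size, dim=32):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab_size, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.tgt_proj = nn.Linear(dim, tgt_vocab_size)

    def forward(self, src_ids, tgt_in):
        # Encode source subwords; decode conditioned on the final encoder state.
        _, state = self.encoder(self.src_embed(src_ids))
        dec_out, _ = self.decoder(self.tgt_embed(tgt_in), state)
        return self.tgt_proj(dec_out)

# One BPE vocabulary built over BOTH source languages: because the languages
# are related, many subword types are shared, so the parent's embeddings for
# those types carry meaningful information into the child. Toy entries only;
# "@@" marks non-final BPE pieces, as in common subword tooling.
shared_src_vocab = ["the@@", "ka@@", "ti", "ra@@", "na", "<unk>"]

parent = TinyNMT(src_vocab_size=len(shared_src_vocab), tgt_vocab_size=100)
# ... train `parent` on the first (related, also low-resource) pair here ...

child = TinyNMT(src_vocab_size=len(shared_src_vocab), tgt_vocab_size=100)
# Transfer ALL parent parameters, source embeddings included. Keeping the
# source embeddings is the point of exploiting the vocabulary overlap,
# rather than re-initializing or reassigning the source vocabulary.
child.load_state_dict(parent.state_dict())

# Continue training the child on the second low-resource pair; shared
# subwords start from trained embeddings rather than random initialization.
optimizer = torch.optim.Adam(child.parameters())
```

Under these assumptions the transfer is a plain `state_dict` copy because both models share the same joint subword vocabulary; with disjoint vocabularies, as in word-based transfer, the source embeddings would have to be discarded or arbitrarily reassigned.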
