Sparse Regression for Machine Translation

Ergun Biçici
Abstract

We use transductive regression techniques to learn mappings between the source and target features of given parallel corpora and use these mappings to generate machine translation outputs. We show the effectiveness of $L_1$ regularized regression (lasso) for learning mappings between sparsely observed feature sets, compared with $L_2$ regularized regression. Proper selection of training instances plays an important role in learning correct feature mappings within limited computational resources and at expected accuracy levels. We introduce the dice instance selection method, which selects training instances so as to improve the source and target coverage of the training set. We show that $L_1$ regularized regression performs better than $L_2$ regularized regression both in regression measurements and in translation experiments using graph decoding. We present encouraging results when translating from German to English and from Spanish to English. We also demonstrate results when the phrase table of a phrase-based decoder is replaced with the mappings we find with the regression model.
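To make the contrast between the two regularizers concrete, the following is a minimal sketch and not the paper's implementation. It fits a single target feature from four toy source features by coordinate descent, once with an $L_1$ penalty and once with an $L_2$ penalty. The data, the penalty weight `lam`, and the function names `soft_threshold` and `coordinate_descent` are all assumptions made for illustration; the paper's models map between full feature sets rather than to a single output.

```python
# Toy comparison of L1 (lasso) and L2 (ridge) regularized regression.
# Illustrative only: data, names, and lam are hypothetical, not from the paper.

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def coordinate_descent(X, y, lam, penalty, n_iters=50):
    """Minimize 0.5*||y - Xw||^2 + penalty term, one coordinate at a time.

    penalty="l1" uses lam*||w||_1; penalty="l2" uses 0.5*lam*||w||_2^2.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                others = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            if z == 0.0:
                w[j] = 0.0  # an all-zero feature carries no information
            elif penalty == "l1":
                w[j] = soft_threshold(rho, lam) / z
            else:
                w[j] = rho / (z + lam)
    return w

# Four instances, four mutually orthogonal features (assumed toy data).
X = [
    [1.0,  1.0,  1.0,  1.0],
    [1.0, -1.0,  1.0, -1.0],
    [1.0,  1.0, -1.0, -1.0],
    [1.0, -1.0, -1.0,  1.0],
]
# The target depends strongly on feature 1 and only weakly on feature 2.
y = [2.0 * row[1] + 0.1 * row[2] for row in X]

lasso_w = coordinate_descent(X, y, lam=1.0, penalty="l1")
ridge_w = coordinate_descent(X, y, lam=1.0, penalty="l2")
print("lasso:", lasso_w)
print("ridge:", ridge_w)
```

On this toy data the $L_1$ fit sets the weakly supported weight to exactly zero, while the $L_2$ fit keeps a small non-zero weight for it. This is the sparsity property that makes lasso a natural candidate when each source feature should map to only a few target features.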
