Multilingual Factor Analysis

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

14 May 2019

Francisco Vargas

Kamen Brestnichki

Alexandros Papadopoulos-Korfiatis

Nils Y. Hammerla

Abstract

In this work, we approach the task of learning multilingual word representations offline by fitting a generative latent variable model to a multilingual dictionary. We model equivalent words in different languages as different views of the same word, generated by a common latent variable representing their latent lexical meaning. We explore the task of alignment by querying the fitted model for multilingual embeddings, achieving competitive results across a variety of tasks. The proposed model is robust to noise in the embedding space, making it a suitable method for distributed representations learned from noisy corpora.
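
To make the "different views of a common latent variable" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a classical linear-Gaussian factor analysis form, x = W z + mu + eps with z ~ N(0, I) and diagonal noise, which the abstract does not spell out. All names (`posterior_mean`, `W_en`, `W_fr`, the dimensions, the noise level) are hypothetical, and the parameters are taken as known rather than fitted to a dictionary.

```python
import numpy as np

def posterior_mean(x, W, mu, psi_diag):
    """E[z | x] for x = W z + mu + eps, z ~ N(0, I), eps ~ N(0, diag(psi_diag)).

    Assumed linear-Gaussian model; x has shape (n, d), W has shape (d, k).
    """
    k = W.shape[1]
    Wt_Pinv = W.T / psi_diag              # W^T Psi^{-1}, shape (k, d)
    precision = np.eye(k) + Wt_Pinv @ W   # posterior precision of z, shape (k, k)
    return np.linalg.solve(precision, Wt_Pinv @ (x - mu).T).T

rng = np.random.default_rng(0)
n, k, d_en, d_fr = 500, 5, 20, 30         # hypothetical sizes

# Shared latent "meaning" vectors, one per dictionary entry.
z = rng.standard_normal((n, k))

# Hypothetical per-language loadings, means, and noise variances.
W_en, W_fr = rng.standard_normal((d_en, k)), rng.standard_normal((d_fr, k))
mu_en, mu_fr = rng.standard_normal(d_en), rng.standard_normal(d_fr)
psi_en, psi_fr = np.full(d_en, 0.1), np.full(d_fr, 0.1)

# Each language's embedding is a noisy linear view of the same z.
x_en = z @ W_en.T + mu_en + rng.standard_normal((n, d_en)) * np.sqrt(psi_en)
x_fr = z @ W_fr.T + mu_fr + rng.standard_normal((n, d_fr)) * np.sqrt(psi_fr)

# "Querying the model": map each view into the shared latent space.
z_en = posterior_mean(x_en, W_en, mu_en, psi_en)
z_fr = posterior_mean(x_fr, W_fr, mu_fr, psi_fr)
```

Under these assumptions, `z_en` and `z_fr` are estimates of the same latent vectors, so translation pairs land close together in the shared space even though the two embedding spaces have different dimensions.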
