Cross-Lingual Sentiment Quantification

Abstract

We discuss \emph{Cross-Lingual Text Quantification} (CLTQ), the task of performing text quantification (i.e., estimating the relative frequency $p_{c}(D)$ of all classes $c\in\mathcal{C}$ in a set $D$ of unlabelled documents) when training documents are available for a source language $\mathcal{S}$ but not for the target language $\mathcal{T}$ for which quantification needs to be performed. CLTQ has never been discussed before in the literature; we establish baseline results for the binary case by combining state-of-the-art quantification methods with methods capable of generating cross-lingual vectorial representations of the source and target documents involved. We present experimental results obtained on publicly available datasets for cross-lingual sentiment classification; the results show that the presented methods can perform CLTQ with a surprising level of accuracy.
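
To make the notion of estimating $p_{c}(D)$ concrete for the binary case, the following is a minimal sketch of two standard quantification baselines, Classify and Count and Adjusted Classify and Count. It is an illustration only and is not claimed to be the specific set of methods evaluated in the paper. The function names are hypothetical, and the sketch assumes that binary predictions for the target documents are already available from some classifier, and that the true and false positive rates (`tpr`, `fpr`) were estimated on held-out labelled data.

```python
from typing import Sequence


def classify_and_count(predictions: Sequence[int]) -> float:
    """Estimate the positive-class prevalence as the fraction of
    documents predicted positive (labels are 0 or 1)."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(
    predictions: Sequence[int], tpr: float, fpr: float
) -> float:
    """Correct the classify-and-count estimate using the classifier's
    true and false positive rates (assumed to be estimated elsewhere,
    e.g. on held-out labelled data)."""
    cc = classify_and_count(predictions)
    if tpr <= fpr:
        # The correction is undefined or unstable; fall back to the raw estimate.
        return cc
    adjusted = (cc - fpr) / (tpr - fpr)
    # Clip to the valid range for a relative frequency.
    return min(1.0, max(0.0, adjusted))


if __name__ == "__main__":
    # Hypothetical predictions for ten unlabelled target-language documents.
    preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    print(classify_and_count(preds))
    print(adjusted_classify_and_count(preds, tpr=0.8, fpr=0.1))
```

The adjustment matters because a classifier's errors do not generally cancel out: if it misses some positives and mislabels some negatives, the raw fraction of predicted positives is a biased estimate of the true prevalence, and this bias may be larger when the classifier is applied to a language other than the one it was trained on.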
