Evaluating the Morphosyntactic Well-formedness of Generated Texts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Graham Neubig
Abstract
Text generation systems are ubiquitous in natural language processing applications. However, evaluation of these systems remains a challenge, especially in multilingual settings. In this paper, we propose LÁMBRE, a metric to evaluate the morphosyntactic well-formedness of text using its dependency parse and morphosyntactic rules of the language. We present a way to automatically extract various rules governing morphosyntax directly from dependency treebanks. To tackle the noisy outputs from text generation systems, we propose a simple methodology to train robust parsers. We show the effectiveness of our metric on the task of machine translation through a diachronic study of systems translating into morphologically rich languages.
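As a rough illustration of scoring text against morphosyntactic rules over a dependency parse, the sketch below checks feature agreement between each dependent and its head. It is not the paper's implementation: the `Token` class, the `AGREEMENT_RULES` set, and the `well_formedness` function are hypothetical names, the rules are written by hand rather than extracted from a treebank, and the score (the fraction of applicable rule checks that are satisfied) is an assumption made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One token of a dependency parse (hypothetical minimal structure)."""
    idx: int        # 1-based position in the sentence
    form: str       # surface form
    head: int       # idx of the head token, 0 for the root
    deprel: str     # dependency relation to the head
    feats: dict = field(default_factory=dict)  # morphological features


# Hand-written agreement rules, assumed for illustration only:
# a dependent attached by `deprel` should match its head on `feature`.
AGREEMENT_RULES = {("det", "Gender"), ("det", "Number"), ("nsubj", "Number")}


def well_formedness(tokens, rules=AGREEMENT_RULES):
    """Return the fraction of applicable agreement checks that hold."""
    by_idx = {t.idx: t for t in tokens}
    checked = satisfied = 0
    for tok in tokens:
        head = by_idx.get(tok.head)
        if head is None:  # root token, or head missing from the parse
            continue
        for rel, feat in rules:
            if tok.deprel != rel:
                continue
            # A rule applies only when both tokens carry the feature.
            if feat not in tok.feats or feat not in head.feats:
                continue
            checked += 1
            if tok.feats[feat] == head.feats[feat]:
                satisfied += 1
    # With no applicable rule there is nothing to penalize.
    return satisfied / checked if checked else 1.0
```

In this sketch, a determiner that disagrees with its noun in gender lowers the score, while a sentence in which no rule applies is scored as fully well-formed.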
