Data-driven Natural Language Generation: Paving the Road to Success

Jekaterina Novikova
Ondrej Dusek
Verena Rieser
Abstract

We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) the lack of reliable automatic evaluation metrics for NLG, and (b) the scarcity of high-quality in-domain corpora. We address the first problem by thoroughly analysing current evaluation metrics and motivating the need for a new, more reliable metric. We address the second by presenting a novel framework for developing and evaluating a high-quality corpus for NLG training.
