Miia Rämö
2021
Using contextual and cross-lingual word embeddings to improve variety in template-based NLG for automated journalism
Miia Rämö
|
Leo Leppänen
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation
In this work, we describe our efforts in improving the variety of language generated from a rule-based NLG system for automated journalism. We present two approaches: one based on inserting completely new words into sentences generated from templates, and another based on replacing words with synonyms. Our initial results from a human evaluation conducted in English indicate that these approaches successfully improve the variety of the language without significantly modifying sentence meaning. We also present variations of the methods applicable to low-resource languages, simulated here using Finnish, where cross-lingual aligned embeddings are harnessed to make use of linguistic resources in a high-resource language. A human evaluation indicates that while proposed methods show potential in the low-resource case, additional work is needed to improve their performance.