Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations

Marco Antonio Sobrevilla Cabezudo, Simon Mille, Thiago Pardo


Abstract
This paper presents an exploratory study that evaluates the usefulness of back-translation in Natural Language Generation (NLG) from semantic representations for non-English languages. Specifically, Abstract Meaning Representation (AMR) is chosen as the semantic representation and Brazilian Portuguese (BP) as the language. Two methods (based on Statistical and Neural Machine Translation, respectively) are evaluated on two datasets (one automatically generated and one human-generated) to compare their performance in a real context. In addition, several cuts of the data according to quality measures are evaluated to assess whether data quality matters in NLG. Results show that the approach is promising, although there is still considerable room for improvement.
Anthology ID:
D19-6313
Volume:
Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Leo Wanner
Venue:
WS
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
94–103
URL:
https://aclanthology.org/D19-6313
DOI:
10.18653/v1/D19-6313
Cite (ACL):
Marco Antonio Sobrevilla Cabezudo, Simon Mille, and Thiago Pardo. 2019. Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations. In Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019), pages 94–103, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations (Sobrevilla Cabezudo et al., 2019)
PDF:
https://aclanthology.org/D19-6313.pdf