Shape of Synth to Come: Why We Should Use Synthetic Data for English Surface Realization

Henry Elder, Robert Burke, Alexander O’Connor, Jennifer Foster


Abstract
The Surface Realization Shared Tasks of 2018 and 2019 were Natural Language Generation shared tasks with the goal of exploring approaches to surface realization from Universal-Dependency-like trees to surface strings for several languages. In the 2018 shared task there was very little difference in the absolute performance of systems trained with and without additional, synthetically created data, and a new rule prohibiting the use of synthetic data was introduced for the 2019 shared task. Contrary to the findings of the 2018 shared task, we show, in experiments on the English 2018 dataset, that the use of synthetic data can have a substantial positive effect – an improvement of almost 8 BLEU points for a previously state-of-the-art system. We analyse the effects of synthetic data, and we argue that its use should be encouraged rather than prohibited so that future research efforts continue to explore systems that can take advantage of such data.
Anthology ID:
2020.acl-main.665
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7465–7471
URL:
https://aclanthology.org/2020.acl-main.665
DOI:
10.18653/v1/2020.acl-main.665
Cite (ACL):
Henry Elder, Robert Burke, Alexander O’Connor, and Jennifer Foster. 2020. Shape of Synth to Come: Why We Should Use Synthetic Data for English Surface Realization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7465–7471, Online. Association for Computational Linguistics.
Cite (Informal):
Shape of Synth to Come: Why We Should Use Synthetic Data for English Surface Realization (Elder et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.665.pdf
Video:
http://slideslive.com/38929432
Code:
Henry-E/surface-realization-shallow-task