A Test Suite and Manual Evaluation of Document-Level NMT at WMT19

Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, Ondřej Bojar


Abstract
As the quality of machine translation rises and neural machine translation (NMT) moves from sentence-level to document-level translation, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing how MT systems participating in the News Translation Task handle discourse phenomena. We have manually checked the outputs and identified types of translation errors that are relevant to document-level translation.
Anthology ID:
W19-5352
Volume:
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
455–463
URL:
https://aclanthology.org/W19-5352
DOI:
10.18653/v1/W19-5352
Cite (ACL):
Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, and Ondřej Bojar. 2019. A Test Suite and Manual Evaluation of Document-Level NMT at WMT19. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 455–463, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Test Suite and Manual Evaluation of Document-Level NMT at WMT19 (Rysová et al., WMT 2019)
PDF:
https://aclanthology.org/W19-5352.pdf
Poster:
 W19-5352.Poster.pdf
Data
Penn Treebank