Detecting Post-Edited References and Their Effect on Human Evaluation

Věra Kloudová, Ondřej Bojar, Martin Popel


Abstract
This paper provides a quick overview of possible methods for detecting that reference translations were actually created by post-editing an MT system's output. Two methods based on automatic metrics are presented: the BLEU difference between the suspected MT system and some other good MT system, and the BLEU difference using additional references. These two methods raised the suspicion that the WMT 2020 Czech reference is based on MT. The suspicion was confirmed by a manual analysis that found concrete evidence of the post-editing procedure in particular sentences. Finally, a typology of post-editing changes is presented, classifying typical errors or changes made by the post-editor as well as errors adopted from the MT output.
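For concreteness, the first detection signal described in the abstract can be sketched as follows: if a reference scores suspiciously high in BLEU against one particular MT system but not against another system of comparable quality, it may have been post-edited from that system's output. This is a minimal illustration using the sacrebleu library; the file names and the decision threshold are illustrative assumptions, not details taken from the paper.

```python
# Sketch of the "BLEU difference between the suspected MT and some
# other good MT" signal. All file names and the 10-point threshold
# are hypothetical, chosen only for illustration.
from sacrebleu.metrics import BLEU


def load_lines(path):
    """Read one segment per line from a plain-text file."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


suspected_mt = load_lines("suspected_system.cs")  # hypothetical output file
other_mt = load_lines("other_good_system.cs")     # hypothetical output file
reference = load_lines("wmt_reference.cs")        # the reference under scrutiny

bleu = BLEU()
score_suspected = bleu.corpus_score(suspected_mt, [reference]).score
score_other = bleu.corpus_score(other_mt, [reference]).score

# If both systems are of comparable quality, a large gap in favour of
# one of them suggests the reference was post-edited from its output.
gap = score_suspected - score_other
print(f"BLEU(suspected)={score_suspected:.1f}  "
      f"BLEU(other)={score_other:.1f}  gap={gap:.1f}")
if gap > 10.0:  # illustrative threshold, an assumption
    print("Reference may be post-edited from the suspected system's output.")
```

The second signal works analogously: scoring the same systems against additional, independently produced references and checking whether the suspected system's advantage disappears.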
Anthology ID:
2021.humeval-1.13
Volume:
Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)
Month:
April
Year:
2021
Address:
Online
Editors:
Anya Belz, Shubham Agarwal, Yvette Graham, Ehud Reiter, Anastasia Shimorina
Venue:
HumEval
Publisher:
Association for Computational Linguistics
Pages:
114–119
URL:
https://aclanthology.org/2021.humeval-1.13
Cite (ACL):
Věra Kloudová, Ondřej Bojar, and Martin Popel. 2021. Detecting Post-Edited References and Their Effect on Human Evaluation. In Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), pages 114–119, Online. Association for Computational Linguistics.
Cite (Informal):
Detecting Post-Edited References and Their Effect on Human Evaluation (Kloudová et al., HumEval 2021)
PDF:
https://aclanthology.org/2021.humeval-1.13.pdf
Data
WMT 2020