Document-Level Machine Translation Evaluation Project: Methodology, Effort and Inter-Annotator Agreement

Sheila Castilho


Abstract
Document-level (doc-level) human evaluation of machine translation (MT) has raised interest in the community after a few attempts have disproved claims of “human parity” (Toral et al., 2018; Läubli et al., 2018). However, little is known about best practices regarding doc-level human evaluation. The goal of this project is to identify which methodologies better cope with i) the current state-of-the-art (SOTA) human metrics, ii) a possible complexity when assigning a single score to a text consisting of ‘good’ and ‘bad’ sentences, iii) a possible tiredness bias in doc-level set-ups, and iv) the difference in inter-annotator agreement (IAA) between sentence- and doc-level set-ups.
Anthology ID:
2020.eamt-1.49
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Venue:
EAMT
Publisher:
European Association for Machine Translation
Note:
Pages:
455–456
URL:
https://aclanthology.org/2020.eamt-1.49
Cite (ACL):
Sheila Castilho. 2020. Document-Level Machine Translation Evaluation Project: Methodology, Effort and Inter-Annotator Agreement. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 455–456, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Document-Level Machine Translation Evaluation Project: Methodology, Effort and Inter-Annotator Agreement (Castilho, EAMT 2020)
PDF:
https://aclanthology.org/2020.eamt-1.49.pdf