%0 Conference Proceedings
%T Towards Document-Level Human MT Evaluation: On the Issues of Annotator Agreement, Effort and Misevaluation
%A Castilho, Sheila
%Y Belz, Anya
%Y Agarwal, Shubham
%Y Graham, Yvette
%Y Reiter, Ehud
%Y Shimorina, Anastasia
%S Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)
%D 2021
%8 April
%I Association for Computational Linguistics
%C Online
%F castilho-2021-towards
%X Document-level human evaluation of machine translation (MT) has been gaining interest in the community. However, little is known about the issues that arise when document-level methodologies are used to assess MT quality. In this article, we compare inter-annotator agreement (IAA) scores and the effort required to assess quality across different document-level methodologies, and we examine the issue of misevaluation when sentences are evaluated out of context.
%U https://aclanthology.org/2021.humeval-1.4
%P 34-45