Picking Out The Best MT Model: On The Methodology Of Human Evaluation

Stepan Korotaev, Andrey Ryabchikov


Abstract
Human evaluation remains a critical step in selecting the best MT model for a job. The common approach is to have a reviewer analyze a number of segments translated by the compared models, assigning each a category and post-editing some of them when needed. In other words, the reviewer is asked to make numerous decisions about very similar, out-of-context translations, which can easily result in arbitrary choices. We propose a new methodology centered on real-life post-editing of a set of cohesive, homogeneous texts. Homogeneity is established by applying a number of metrics to a set of preselected documents of the same genre. The key assumption is that two or more homogeneous texts of identical length take approximately the same time and effort when edited by the same editor. Hence, if one text requires more work (greater edit distance, more time spent), this indicates that the machine translation used for that text is of relatively lower quality. See the attached file for details.
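The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the character-level Levenshtein distance, and the normalization by text length are all assumptions made for illustration, and the abstract does not specify which edit-distance variant or time measure the paper uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level Levenshtein distance (assumed metric, two-row DP)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def post_editing_effort(mt_output: str, post_edited: str, seconds_spent: float) -> dict:
    """Hypothetical helper: summarize the effort one editor spent on one text."""
    distance = levenshtein(mt_output, post_edited)
    longest = max(len(mt_output), len(post_edited), 1)
    return {
        # Share of characters changed, so texts of similar length are comparable.
        "normalized_edit_distance": distance / longest,
        # Editing time relative to the length of the final text.
        "seconds_per_char": seconds_spent / max(len(post_edited), 1),
    }
```

Under the abstract's key assumption, calling `post_editing_effort` on homogeneous texts of equal length, each translated by a different model and edited by the same editor, would yield values where a higher result suggests relatively lower MT quality for that model.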
Anthology ID:
2022.amta-upg.2
Volume:
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)
Month:
September
Year:
2022
Address:
Orlando, USA
Editors:
Janice Campbell, Stephen Larocca, Jay Marciano, Konstantin Savenkov, Alex Yanishevsky
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
12–23
URL:
https://aclanthology.org/2022.amta-upg.2
Cite (ACL):
Stepan Korotaev and Andrey Ryabchikov. 2022. Picking Out The Best MT Model: On The Methodology Of Human Evaluation. In Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track), pages 12–23, Orlando, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Picking Out The Best MT Model: On The Methodology Of Human Evaluation (Korotaev & Ryabchikov, AMTA 2022)
PDF:
https://aclanthology.org/2022.amta-upg.2.pdf
Code:
effectiff-tech/homogeneity-scripts