Comparative Quality Assessment of Human and Machine Translation with Best-Worst Scaling

Bettina Hiebl, Dagmar Gromann


Abstract
Translation quality and its assessment are of great importance in the context of both human and machine translation. Assessment methods range from human annotation and assessment to quality metrics and quality estimation, where the former are rather time-consuming. Furthermore, assessing translation quality is a subjective process. Best-Worst Scaling (BWS) is a time-efficient annotation method for obtaining subjective preferences, namely the best and the worst item in a given set, together with their ratings. In this paper, we propose using BWS for a comparative quality assessment of one human and three machine translations into German of the same English source text. Ten participants with a translation background selected the human translation most frequently and rated it best overall, closely followed by DeepL. Participants showed an overall positive attitude towards this assessment method.
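To illustrate how best/worst selections can be turned into a ranking, here is a minimal sketch. The abstract does not state how the paper aggregates its judgments, so this assumes a simple counting procedure that is common for BWS: each item's score is the number of times it was chosen as best minus the number of times it was chosen as worst, divided by the number of times it was shown. The function name `bws_scores`, the labels `HT`, `MT_A`, `MT_B`, `MT_C`, and all judgments are hypothetical and are not data from the study.

```python
from collections import Counter


def bws_scores(judgments):
    """Counting-based BWS score per item: (best - worst) / appearances.

    judgments: iterable of (items_shown, best_item, worst_item) tuples.
    Assumed scoring scheme; the paper's own procedure may differ.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, best_item, worst_item in judgments:
        shown.update(items)      # every item in the set was displayed once
        best[best_item] += 1     # one "best" vote per judgment
        worst[worst_item] += 1   # one "worst" vote per judgment
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


# Hypothetical data: four judgments over one human (HT) and three machine
# translations of the same segment. These are NOT results from the paper.
candidates = ("HT", "MT_A", "MT_B", "MT_C")
judgments = [
    (candidates, "HT", "MT_C"),
    (candidates, "MT_A", "MT_C"),
    (candidates, "HT", "MT_B"),
    (candidates, "HT", "MT_C"),
]

scores = bws_scores(judgments)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Scores produced this way lie between -1 (always chosen as worst) and 1 (always chosen as best), which makes candidates comparable even when they were shown a different number of times.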
Anthology ID:
2024.eamt-1.42
Volume:
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
Month:
June
Year:
2024
Address:
Sheffield, UK
Editors:
Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Rachel Bawden, Víctor M Sánchez-Cartagena, Patrick Cadwell, Ekaterina Lapshinova-Koltunski, Vera Cabarrão, Konstantinos Chatzitheodorou, Mary Nurminen, Diptesh Kanojia, Helena Moniz
Venue:
EAMT
Publisher:
European Association for Machine Translation (EAMT)
Pages:
507–536
URL:
https://aclanthology.org/2024.eamt-1.42
Cite (ACL):
Bettina Hiebl and Dagmar Gromann. 2024. Comparative Quality Assessment of Human and Machine Translation with Best-Worst Scaling. In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), pages 507–536, Sheffield, UK. European Association for Machine Translation (EAMT).
Cite (Informal):
Comparative Quality Assessment of Human and Machine Translation with Best-Worst Scaling (Hiebl & Gromann, EAMT 2024)
PDF:
https://aclanthology.org/2024.eamt-1.42.pdf