Evaluation Dataset for Lexical Translation Consistency in Chinese-to-English Document-level Translation

Xiangyu Lei; Junhui Li (李军辉); Shimin Tao; Hao Yang

Abstract

Lexical translation consistency is one of the most common discourse phenomena in Chinese-to-English document-level translation. To evaluate lexical translation consistency, previous research assumes that all repeated source words should be translated consistently. However, constraining the translations of repeated source words to be consistent hurts word diversity, and human translators tend to use different words in translation. Therefore, in this paper we construct a test set of 310 bilingual news articles to properly evaluate lexical translation consistency. We manually divide the repeated source words whose translations are consistent into two types: true consistency and false consistency. Then, based on the constructed test set, we evaluate the lexical translation consistency of several typical NMT systems.

Anthology ID: 2024.lrec-main.583
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 6575–6581
URL: https://aclanthology.org/2024.lrec-main.583/
Cite (ACL): Xiangyu Lei, Junhui Li, Shimin Tao, and Hao Yang. 2024. Evaluation Dataset for Lexical Translation Consistency in Chinese-to-English Document-level Translation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 6575–6581, Torino, Italia. ELRA and ICCL.
Cite (Informal): Evaluation Dataset for Lexical Translation Consistency in Chinese-to-English Document-level Translation (Lei et al., LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.583.pdf