Vacillating Human Correlation of SacreBLEU in Unprotected Languages

Ahrii Kim, Jinhyeon Kim


Abstract
SacreBLEU, by incorporating a text normalizing step in the pipeline, has become a rising automatic evaluation metric in recent MT studies. With agglutinative languages such as Korean, however, the lexical-level metric cannot provide a conceivable result without a customized pre-tokenization. This paper endeavors to examine the influence of diversified tokenization schemes – word, morpheme, subword, character, and consonants & vowels (CV) – on the metric after its protective layer is peeled off. By performing meta-evaluation with manually-constructed into-Korean resources, our empirical study demonstrates that the human correlation of the surface-based metric and other homogeneous ones (as an extension) vacillates greatly by the token type. Moreover, the human correlation of the metric often deteriorates due to some tokenization, with CV one of its culprits. Guiding through the proper usage of tokenizers for the given metric, we discover i) the feasibility of the character tokens and ii) the deficit of CV in the Korean MT evaluation.
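To make the token types named in the abstract concrete, the following is a minimal sketch of three of them (word, character, and CV) for a Korean sentence. It is an illustration only, not the authors' pipeline: the function names and the example sentence are hypothetical, and the CV split is assumed here to be approximated by Unicode NFD decomposition of Hangul syllables into jamo. Morpheme and subword tokenization would need an external analyzer or a trained subword model and are left out.

```python
import unicodedata


def word_tokens(text):
    # Word level: split on whitespace only.
    return text.split()


def char_tokens(text):
    # Character level: one token per precomposed Hangul syllable.
    composed = unicodedata.normalize("NFC", text)
    return [ch for ch in composed if not ch.isspace()]


def cv_tokens(text):
    # CV level (assumption): NFD decomposes each Hangul syllable into its
    # leading consonant, vowel, and optional trailing consonant (jamo).
    decomposed = unicodedata.normalize("NFD", text)
    return [ch for ch in decomposed if not ch.isspace()]


sentence = "나는 학교에 간다"  # hypothetical example sentence
print(len(word_tokens(sentence)), word_tokens(sentence))
print(len(char_tokens(sentence)), char_tokens(sentence))
print(len(cv_tokens(sentence)))
```

For this sentence the three schemes yield 3, 7, and 17 tokens respectively, which shows how strongly the n-gram units available to a surface-based metric depend on the pre-tokenization applied to hypothesis and reference.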
Anthology ID:
2022.humeval-1.1
Volume:
Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Anya Belz, Maja Popović, Ehud Reiter, Anastasia Shimorina
Venue:
HumEval
Publisher:
Association for Computational Linguistics
Pages:
1–15
URL:
https://aclanthology.org/2022.humeval-1.1
DOI:
10.18653/v1/2022.humeval-1.1
Cite (ACL):
Ahrii Kim and Jinhyeon Kim. 2022. Vacillating Human Correlation of SacreBLEU in Unprotected Languages. In Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval), pages 1–15, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Vacillating Human Correlation of SacreBLEU in Unprotected Languages (Kim & Kim, HumEval 2022)
PDF:
https://aclanthology.org/2022.humeval-1.1.pdf
Video:
https://aclanthology.org/2022.humeval-1.1.mp4