MSLC24: Further Challenges for Metrics on a Wide Landscape of Translation Quality

Rebecca Knowles, Samuel Larkin, Chi-Kiu Lo


Abstract
In this second edition of the Metric Score Landscape Challenge (MSLC), we examine how automatic metrics for machine translation perform on a wide variety of machine translation output, ranging from very low-quality systems to the types of high-quality systems submitted to the General MT shared task at WMT. We also explore metric results on specific types of data, such as empty strings, wrong- or mixed-language text, and more. We raise several alarms about inconsistencies in metric scores, some of which can be resolved by increasingly explicit instructions for metric use, while others highlight technical flaws.
Anthology ID:
2024.wmt-1.34
Volume:
Proceedings of the Ninth Conference on Machine Translation
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
475–491
URL:
https://aclanthology.org/2024.wmt-1.34
Cite (ACL):
Rebecca Knowles, Samuel Larkin, and Chi-Kiu Lo. 2024. MSLC24: Further Challenges for Metrics on a Wide Landscape of Translation Quality. In Proceedings of the Ninth Conference on Machine Translation, pages 475–491, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
MSLC24: Further Challenges for Metrics on a Wide Landscape of Translation Quality (Knowles et al., WMT 2024)
PDF:
https://aclanthology.org/2024.wmt-1.34.pdf