MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task

Juraj Juraska; Mara Finkelstein; Daniel Deutsch; Aditya Siddhant; Mehdi Mirzazadeh; Markus Freitag

doi:10.18653/v1/2023.wmt-1.63

Abstract

This report details the MetricX-23 submission to the WMT23 Metrics Shared Task and provides an overview of the experiments that informed which metrics were submitted. Our three submissions, each with a quality estimation (or reference-free) version, are all learned regression-based metrics that differ in their training data and in the pretrained language model used for initialization. We report results on (1) which supervised training data to use, (2) how the normalization of training labels affects performance, (3) how much synthetic training data to use, (4) how metric performance relates to model size, and (5) the effect of initializing the metrics with different pretrained language models. The most successful training recipe for MetricX employs two-stage fine-tuning on direct assessment (DA) and Multidimensional Quality Metrics (MQM) ratings and includes synthetic training data. Finally, one important takeaway from our extensive experiments is that optimizing for segment- and system-level performance at the same time is challenging.
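To illustrate why segment-level and system-level performance can pull apart, the following is a minimal sketch on made-up toy scores, not the shared task's evaluation code. The function names, the data, and the choice of Pearson correlation are assumptions made for illustration; the shared task itself uses its own set of meta-evaluation statistics.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def segment_level(metric, human):
    """Correlate metric and human scores over all individual segments."""
    systems = list(metric)
    m = [score for s in systems for score in metric[s]]
    h = [score for s in systems for score in human[s]]
    return pearson(m, h)


def system_level(metric, human):
    """Correlate per-system mean metric scores with per-system mean human scores."""
    systems = list(metric)
    m = [mean(metric[s]) for s in systems]
    h = [mean(human[s]) for s in systems]
    return pearson(m, h)


# Hypothetical scores for three MT systems on three segments each
# (higher is better in this toy example).
human = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [3, 4, 5]}
metric = {"A": [3, 2, 1], "B": [4, 3, 2], "C": [5, 4, 3]}

print("system-level :", system_level(metric, human))
print("segment-level:", segment_level(metric, human))
```

In this toy data the metric ranks the three systems exactly as the human scores do, yet it orders the segments within each system in reverse, so the system-level correlation is perfect while the segment-level correlation is zero. A metric can therefore do well at one granularity without doing well at the other.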

Anthology ID: 2023.wmt-1.63
Volume: Proceedings of the Eighth Conference on Machine Translation
Month: December
Year: 2023
Address: Singapore
Editors: Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 756–767
URL: https://aclanthology.org/2023.wmt-1.63
DOI: 10.18653/v1/2023.wmt-1.63
Cite (ACL): Juraj Juraska, Mara Finkelstein, Daniel Deutsch, Aditya Siddhant, Mehdi Mirzazadeh, and Markus Freitag. 2023. MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task. In Proceedings of the Eighth Conference on Machine Translation, pages 756–767, Singapore. Association for Computational Linguistics.
Cite (Informal): MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task (Juraska et al., WMT 2023)
PDF: https://aclanthology.org/2023.wmt-1.63.pdf
