Cometoid: Distilling Strong Reference-based Machine Translation Metrics into Even Stronger Quality Estimation Metrics

Thamme Gowda, Tom Kocmi, Marcin Junczys-Dowmunt


Abstract
This paper describes our submissions to the 2023 Conference on Machine Translation (WMT-23) Metrics shared task. Knowledge distillation is commonly used to create smaller student models that mimic a larger teacher model while reducing the model size and hence inference cost in production. In this work, we apply knowledge distillation to machine translation evaluation metrics and distill existing reference-based teacher metrics into reference-free (quality estimation; QE) student metrics. We mainly focus on students of Unbabel’s COMET22 reference-based metric. When evaluated on the official WMT-22 Metrics evaluation task, our distilled Cometoid QE metrics outperform all other QE metrics on that set while matching or outperforming the reference-based teacher metric. Our metrics never see the human ground-truth scores directly – only the teacher metric was trained on human scores by its original creators. We also distill ChrF sentence-level scores into a neural QE metric and find that our reference-free (and fully human-score-free) student metric ChrFoid outperforms its teacher metric by over 7% pairwise accuracy on the same WMT-22 task, rivaling other existing QE metrics.
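The abstract describes the core recipe: score (hypothesis, reference) pairs with a reference-based teacher metric, then train a reference-free student to predict those scores from (source, hypothesis) alone. The following is a minimal, illustrative Python sketch of that idea, assuming sacrebleu's sentence-level ChrF as the teacher and a simple ridge regressor over off-the-shelf multilingual sentence embeddings as the student; it is not the paper's Cometoid/ChrFoid architecture or training setup.

    # Minimal sketch of metric distillation into a QE (reference-free) student.
    # Assumptions: sacrebleu ChrF as the teacher; a tiny ridge regressor over
    # frozen multilingual sentence embeddings as the student -- NOT the paper's
    # actual model or training procedure.
    import numpy as np
    import sacrebleu
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import Ridge

    # Toy parallel data: sources, MT hypotheses, and references.
    # References are used ONLY to compute teacher scores; the student never sees them.
    sources    = ["Der Hund bellt.", "Ich mag Kaffee."]
    hypotheses = ["The dog barks.", "I likes coffee."]
    references = ["The dog is barking.", "I like coffee."]

    # 1) Pseudo-labels from the reference-based teacher (sentence-level ChrF, rescaled to 0-1).
    teacher_scores = [
        sacrebleu.sentence_chrf(hyp, [ref]).score / 100.0
        for hyp, ref in zip(hypotheses, references)
    ]

    # 2) Reference-free student features come from (source, hypothesis) only.
    encoder = SentenceTransformer("sentence-transformers/distiluse-base-multilingual-cased-v2")
    src_emb = encoder.encode(sources)
    hyp_emb = encoder.encode(hypotheses)
    features = np.concatenate([src_emb, hyp_emb, np.abs(src_emb - hyp_emb)], axis=1)

    # 3) Train the student to regress the teacher's scores; no human judgments are used.
    student = Ridge(alpha=1.0).fit(features, teacher_scores)

    # At inference time the student scores (source, hypothesis) pairs directly,
    # with no reference available.
    print(student.predict(features))

In the paper, the teacher is a strong learned metric such as COMET22 (or ChrF for ChrFoid) and the student is a neural QE model, but the division of labor is the same: only the teacher ever touches references or human scores.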
Anthology ID:
2023.wmt-1.62
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
751–755
URL:
https://aclanthology.org/2023.wmt-1.62
DOI:
10.18653/v1/2023.wmt-1.62
Cite (ACL):
Thamme Gowda, Tom Kocmi, and Marcin Junczys-Dowmunt. 2023. Cometoid: Distilling Strong Reference-based Machine Translation Metrics into Even Stronger Quality Estimation Metrics. In Proceedings of the Eighth Conference on Machine Translation, pages 751–755, Singapore. Association for Computational Linguistics.
Cite (Informal):
Cometoid: Distilling Strong Reference-based Machine Translation Metrics into Even Stronger Quality Estimation Metrics (Gowda et al., WMT 2023)
PDF:
https://aclanthology.org/2023.wmt-1.62.pdf