COMET-poly: Machine Translation Metric Grounded in Other Candidates

Maike Züfle; Vilém Zouhar; Tu Anh Dinh; Felipe Maia Polo; Jan Niehues; Mrinmaya Sachan

doi:10.18653/v1/2025.wmt-1.63

Abstract

Automated metrics for machine translation attempt to replicate human judgment. Unlike humans, who often assess a translation in the context of multiple alternatives, these metrics typically consider only the source sentence and a single translation. This discrepancy in the evaluation setup may negatively impact the performance of automated metrics. We propose two automated metrics that incorporate additional information beyond the single translation. COMET-polycand uses alternative translations of the same source sentence to compare and contrast with the translation at hand, thereby providing a more informed assessment of its quality. COMET-polyic, inspired by retrieval-based in-context learning, takes in translations of similar source texts along with their human-labeled quality scores to guide the evaluation. We find that including a single additional translation in COMET-polycand improves the segment-level metric performance (0.079 to 0.118 Kendall’s tau-b correlation), with further gains when more translations are added. Incorporating retrieved examples in COMET-polyic yields similar improvements (0.079 to 0.116 Kendall’s tau-b correlation). We release our models publicly.
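The correlations above are reported as Kendall's tau-b, which compares how a metric ranks translations against how humans rank them, with a correction for ties. Below is a minimal, generic sketch of that statistic in plain Python. It is not the paper's evaluation code: the function name `kendall_tau_b` and the example scores are illustrative assumptions, and the paper's exact protocol (for example, how segments are grouped before correlating) is not specified in this abstract.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length score lists (tie-corrected)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    # Compare every unordered pair of items once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both lists: excluded from both denominator terms
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1  # both lists order the pair the same way
        else:
            discordant += 1  # the lists disagree on the order
    untied = concordant + discordant
    # (pairs not tied in x) * (pairs not tied in y)
    denom = sqrt((untied + ties_y_only) * (untied + ties_x_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical scores for four translations (illustrative values only).
human_scores = [1, 2, 3, 4]
metric_scores = [0.1, 0.3, 0.2, 0.9]
tau = kendall_tau_b(human_scores, metric_scores)
```

In this toy example five of the six pairs are ordered consistently and one is swapped, so the statistic is (5 - 1) / 6. When many segments share the same human score, the tie correction in the denominator is what distinguishes tau-b from the simpler tau-a.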

Anthology ID: 2025.wmt-1.63
Volume: Proceedings of the Tenth Conference on Machine Translation
Month: November
Year: 2025
Address: Suzhou, China
Editors: Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue: WMT
Publisher: Association for Computational Linguistics
Pages: 887–904
URL: https://aclanthology.org/2025.wmt-1.63/
DOI: 10.18653/v1/2025.wmt-1.63
Cite (ACL): Maike Züfle, Vilém Zouhar, Tu Anh Dinh, Felipe Maia Polo, Jan Niehues, and Mrinmaya Sachan. 2025. COMET-poly: Machine Translation Metric Grounded in Other Candidates. In Proceedings of the Tenth Conference on Machine Translation, pages 887–904, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): COMET-poly: Machine Translation Metric Grounded in Other Candidates (Züfle et al., WMT 2025)
PDF: https://aclanthology.org/2025.wmt-1.63.pdf
