Don’t Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation

Giorgos Vernikos, Andrei Popescu-Belis


Abstract
Neural machine translation systems estimate probabilities of target sentences given source sentences, yet these estimates may not align with human preferences. This work introduces QE-fusion, a method that synthesizes translations using a quality estimation (QE) metric, which correlates better with human judgments. QE-fusion leverages a pool of candidates sampled from a model, combining spans from different candidates using a QE metric such as CometKiwi. We compare QE-fusion against beam search and recent reranking techniques, such as Minimum Bayes Risk decoding and QE-reranking. Our method consistently improves translation quality in terms of COMET and BLEURT scores when applied to large language models (LLMs) used for translation (PolyLM, XGLM, Llama2, Mistral, ALMA, and Tower) and to multilingual translation models (NLLB), over five language pairs. Notably, QE-fusion exhibits larger improvements for LLMs due to their ability to generate diverse outputs. We demonstrate that our approach generates novel translations in over half of the cases and consistently outperforms other methods across varying numbers of candidates (5–200). Furthermore, we empirically establish that QE-fusion scales linearly with the number of candidates in the pool.
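The contrast between reranking and combining candidates can be illustrated with a small sketch. This is not the authors' implementation: it is a greedy, single-pass approximation written for illustration, and the abstract does not specify how spans are identified or searched. The names `qe_rerank`, `qe_fusion`, and `toy_qe` are hypothetical, and `toy_qe` is a stand-in word-counting scorer that replaces a real QE metric such as CometKiwi. Divergent spans are found here with Python's standard `difflib.SequenceMatcher`, which is an assumption of this sketch.

```python
from difflib import SequenceMatcher
from typing import Callable, List

# Hypothetical scorer type: (source, hypothesis) -> quality score, higher is better.
QEScorer = Callable[[str, str], float]


def qe_rerank(source: str, candidates: List[str], qe_score: QEScorer) -> str:
    """Reranking baseline: return the single best-scoring candidate unchanged."""
    return max(candidates, key=lambda cand: qe_score(source, cand))


def qe_fusion(source: str, candidates: List[str], qe_score: QEScorer) -> str:
    """Greedy sketch of span combination (an assumption, not the paper's exact search).

    Start from the best-scoring candidate, then try replacing each span where
    another candidate differs, keeping a replacement only if the score improves.
    """
    base = qe_rerank(source, candidates, qe_score).split()
    best = qe_score(source, " ".join(base))
    for other in candidates:
        other_tokens = other.split()
        matcher = SequenceMatcher(a=base, b=other_tokens, autojunk=False)
        # Walk the differences right to left so that earlier indices into
        # `base` remain valid after a replacement is accepted.
        for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):
            if tag == "equal":
                continue
            trial = base[:i1] + other_tokens[j1:j2] + base[i2:]
            score = qe_score(source, " ".join(trial))
            if score > best:
                base, best = trial, score
    return " ".join(base)


# Toy stand-in for a QE metric: rewards words from a fixed set, penalizes others.
GOOD_WORDS = {"the", "cat", "sat", "on", "mat"}


def toy_qe(source: str, hypothesis: str) -> float:
    words = hypothesis.split()
    return float(sum(1 if w in GOOD_WORDS else -1 for w in words))


candidates = ["the cat sits on the mat", "a cat sat on a mat"]
print(qe_rerank("<source sentence>", candidates, toy_qe))
print(qe_fusion("<source sentence>", candidates, toy_qe))
```

With this toy scorer, reranking can only return one of the two candidates, whereas the fusion sketch takes "sat" from the second candidate and keeps the rest of the first, producing a sentence that appears in neither. A real setup would replace `toy_qe` with calls to a trained QE model and would likely batch the scoring calls.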
Anthology ID:
2024.acl-long.653
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
12087–12105
URL:
https://aclanthology.org/2024.acl-long.653
Cite (ACL):
Giorgos Vernikos and Andrei Popescu-Belis. 2024. Don’t Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12087–12105, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Don’t Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation (Vernikos & Popescu-Belis, ACL 2024)
PDF:
https://aclanthology.org/2024.acl-long.653.pdf