Chasing COMET: Leveraging Minimum Bayes Risk Decoding for Self-Improving Machine Translation

Kamil Guttmann, Mikołaj Pokrywka, Adrian Charkiewicz, Artur Nowakowski


Abstract
This paper explores Minimum Bayes Risk (MBR) decoding for self-improvement in machine translation (MT), particularly for domain adaptation and low-resource languages. We implement the self-improvement process by fine-tuning the model on its own MBR-decoded forward translations. By employing COMET as the MBR utility metric, we aim to rerank candidate translations so that the selected output aligns better with human preferences. We also examine the iterative application of this approach and the potential need for language-specific MBR utility metrics. The results demonstrate significant improvements in translation quality for all examined language pairs, including successful application to domain-adapted models and generalisation to low-resource settings. This highlights the potential of COMET-guided MBR for efficient MT self-improvement in various scenarios.
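The reranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the utility below is a toy token-overlap F1 that stands in for COMET, and the names `overlap_f1` and `mbr_select` are hypothetical. Each candidate is scored by its average utility against the other candidates, which serve as pseudo-references, and the highest-scoring candidate is returned.

```python
from collections import Counter
from typing import Callable, List


def overlap_f1(hypothesis: str, reference: str) -> float:
    """Toy utility: token-level F1 overlap. An assumption standing in for COMET."""
    hyp_tokens = hypothesis.split()
    ref_tokens = reference.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    common = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if common == 0:
        return 0.0
    precision = common / len(hyp_tokens)
    recall = common / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: List[str],
    utility: Callable[[str, str], float] = overlap_f1,
) -> str:
    """Return the candidate with the highest average utility against the others."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(index: int) -> float:
        # The remaining candidates act as pseudo-references.
        others = [c for j, c in enumerate(candidates) if j != index]
        return sum(utility(candidates[index], ref) for ref in others) / len(others)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]


# Hypothetical candidate translations sampled for one source sentence.
candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran",
]
selected = mbr_select(candidates)
print(selected)
```

In the setting the abstract describes, the selected output would be paired with its source sentence to form synthetic data for fine-tuning, and a learned metric such as COMET would replace the toy utility through the `utility` argument.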
Anthology ID:
2024.eamt-1.11
Volume:
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
Month:
June
Year:
2024
Address:
Sheffield, UK
Editors:
Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Rachel Bawden, Víctor M Sánchez-Cartagena, Patrick Cadwell, Ekaterina Lapshinova-Koltunski, Vera Cabarrão, Konstantinos Chatzitheodorou, Mary Nurminen, Diptesh Kanojia, Helena Moniz
Venue:
EAMT
Publisher:
European Association for Machine Translation (EAMT)
Pages:
80–99
URL:
https://aclanthology.org/2024.eamt-1.11
Cite (ACL):
Kamil Guttmann, Mikołaj Pokrywka, Adrian Charkiewicz, and Artur Nowakowski. 2024. Chasing COMET: Leveraging Minimum Bayes Risk Decoding for Self-Improving Machine Translation. In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), pages 80–99, Sheffield, UK. European Association for Machine Translation (EAMT).
Cite (Informal):
Chasing COMET: Leveraging Minimum Bayes Risk Decoding for Self-Improving Machine Translation (Guttmann et al., EAMT 2024)
PDF:
https://aclanthology.org/2024.eamt-1.11.pdf