Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis

Hippolyte Gisserot-Boukhlef, Ricardo Rei, Emmanuel Malherbe, Céline Hudelot, Pierre Colombo, Nuno M. Guerreiro


Abstract
Neural metrics for machine translation (MT) evaluation have become increasingly prominent due to their superior correlation with human judgments compared to traditional lexical metrics. Researchers have therefore utilized neural metrics through quality-informed decoding strategies, achieving better results than likelihood-based methods. With the rise of Large Language Models (LLMs), preference-based alignment techniques have gained attention for their potential to enhance translation quality by optimizing model weights directly on preferences induced by quality estimators. This study focuses on Contrastive Preference Optimization (CPO) and conducts extensive experiments to evaluate the impact of preference-based alignment on translation quality. Our findings indicate that while CPO consistently outperforms Supervised Fine-Tuning (SFT) on high-quality data with regard to the alignment metric, it may lead to instability across downstream evaluation metrics, particularly between neural and lexical ones. Additionally, we demonstrate that relying solely on the base model for generating candidate translations achieves performance comparable to using multiple external systems, while ensuring better consistency across downstream metrics.
Anthology ID:
2024.wmt-1.127
Volume:
Proceedings of the Ninth Conference on Machine Translation
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
1373–1392
URL:
https://aclanthology.org/2024.wmt-1.127
Cite (ACL):
Hippolyte Gisserot-Boukhlef, Ricardo Rei, Emmanuel Malherbe, Céline Hudelot, Pierre Colombo, and Nuno M. Guerreiro. 2024. Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis. In Proceedings of the Ninth Conference on Machine Translation, pages 1373–1392, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis (Gisserot-Boukhlef et al., WMT 2024)
PDF:
https://aclanthology.org/2024.wmt-1.127.pdf