Estimation vs Metrics: is QE Useful for MT Model Selection?

Anna Zaretskaya, José Conceição, Frederick Bane


Abstract
This paper presents a case study of applying machine translation quality estimation (QE) to machine translation (MT) engine selection. The goal is to understand how well the QE predictions correlate with several MT evaluation metrics, both automatic and human. Our findings show that our industry-level QE system is not reliable enough for MT selection when the MT systems have similar performance. We suggest that QE can be used with more success for other tasks relevant to the translation industry, such as risk prevention.
Anthology ID:
2020.eamt-1.36
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Editors:
André Martins, Helena Moniz, Sara Fumega, Bruno Martins, Fernando Batista, Luisa Coheur, Carla Parra, Isabel Trancoso, Marco Turchi, Arianna Bisazza, Joss Moorkens, Ana Guerberof, Mary Nurminen, Lena Marg, Mikel L. Forcada
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
339–346
URL:
https://aclanthology.org/2020.eamt-1.36
Cite (ACL):
Anna Zaretskaya, José Conceição, and Frederick Bane. 2020. Estimation vs Metrics: is QE Useful for MT Model Selection?. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 339–346, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Estimation vs Metrics: is QE Useful for MT Model Selection? (Zaretskaya et al., EAMT 2020)
PDF:
https://aclanthology.org/2020.eamt-1.36.pdf