Ensemble Fine-tuned mBERT for Translation Quality Estimation

Shaika Chowdhury, Naouel Baili, Brian Vannah


Abstract
Quality Estimation (QE) is an important component of the machine translation workflow as it assesses the quality of the translated output without consulting reference translations. In this paper, we discuss our submission to the WMT 2021 QE Shared Task. We participate in the Task 2 sentence-level sub-task, which challenges participants to predict the HTER score for sentence-level post-editing effort. Our proposed system is an ensemble of multilingual BERT (mBERT)-based regression models, which are generated by fine-tuning on different input settings. It demonstrates comparable performance with respect to Pearson's correlation and beats the baseline system in MAE/RMSE for several language pairs. In addition, we adapt our system for the zero-shot setting by exploiting target language-relevant language pairs and pseudo-reference translations.
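
Below is a minimal, illustrative sketch of the kind of setup the abstract describes: an mBERT model with a single-output regression head used to predict HTER, with several such fine-tuned models averaged into an ensemble. The model checkpoint name, the (source, translation) sentence-pair input setting, and the simple mean-averaging are assumptions for illustration, not the authors' exact configuration.

```python
# Sketch (assumed setup): mBERT regression head for HTER prediction,
# plus ensembling by averaging the scalar outputs of several models.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

def build_model():
    # num_labels=1 yields a single scalar output; trained against float
    # HTER labels with MSE loss, the classification head acts as a regressor.
    return AutoModelForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=1
    )

def encode(source: str, translation: str):
    # One hypothetical "input setting": source and MT output as a sentence pair.
    return tokenizer(source, translation, truncation=True,
                     padding=True, return_tensors="pt")

@torch.no_grad()
def ensemble_predict(models, source: str, translation: str) -> float:
    # Average the scalar HTER predictions of the fine-tuned models.
    inputs = encode(source, translation)
    scores = [m(**inputs).logits.squeeze().item() for m in models]
    return sum(scores) / len(scores)
```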
Anthology ID:
2021.wmt-1.93
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzmán, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
897–903
URL:
https://aclanthology.org/2021.wmt-1.93
Cite (ACL):
Shaika Chowdhury, Naouel Baili, and Brian Vannah. 2021. Ensemble Fine-tuned mBERT for Translation Quality Estimation. In Proceedings of the Sixth Conference on Machine Translation, pages 897–903, Online. Association for Computational Linguistics.
Cite (Informal):
Ensemble Fine-tuned mBERT for Translation Quality Estimation (Chowdhury et al., WMT 2021)
PDF:
https://aclanthology.org/2021.wmt-1.93.pdf