Bias Mitigation in Machine Translation Quality Estimation

Hanna Behnke, Marina Fomicheva, Lucia Specia


Abstract
Machine Translation Quality Estimation (QE) aims to build predictive models to assess the quality of machine-generated translations in the absence of reference translations. While state-of-the-art QE models have been shown to achieve good results, they over-rely on features that do not have a causal impact on the quality of a translation. In particular, there appears to be a partial input bias, i.e., a tendency to assign high quality scores to translations that are fluent and grammatically correct, even though they do not preserve the meaning of the source. We analyse the partial input bias in further detail and evaluate four approaches that use auxiliary tasks for bias mitigation. Two approaches use additional data to inform and support the main task, while the other two are adversarial, actively discouraging the model from learning the bias. We compare the methods with respect to their ability to reduce the partial input bias while maintaining the overall performance. We find that training a multitask architecture with an auxiliary binary classification task that utilises additional augmented data best achieves the desired effects and generalises well to different languages and quality metrics.
Anthology ID:
2022.acl-long.104
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1475–1487
URL:
https://aclanthology.org/2022.acl-long.104
DOI:
10.18653/v1/2022.acl-long.104
Cite (ACL):
Hanna Behnke, Marina Fomicheva, and Lucia Specia. 2022. Bias Mitigation in Machine Translation Quality Estimation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1475–1487, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Bias Mitigation in Machine Translation Quality Estimation (Behnke et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.104.pdf
Software:
2022.acl-long.104.software.zip
Code:
agesb/transquest
Data:
MLQE-PE, WikiMatrix