Machine Translation Reference-less Evaluation using YiSi-2 with Bilingual Mappings of Massive Multilingual Language Model

Chi-kiu Lo, Samuel Larkin


Abstract
We present a study on using YiSi-2 with massive multilingual pretrained language models for machine translation (MT) reference-less evaluation. Aiming at finding better semantic representation for semantic MT evaluation, we first test YiSi-2 with contextual embeddings extracted from different layers of two different pretrained models, multilingual BERT and XLM-RoBERTa. We also experiment with learning bilingual mappings that transform the vector subspace of the source language to be closer to that of the target language in the pretrained model to obtain more accurate cross-lingual semantic similarity representations. Our results show that YiSi-2’s correlation with human direct assessment on translation quality is greatly improved by replacing multilingual BERT with XLM-RoBERTa and projecting the source embeddings into the target embedding space using a cross-lingual linear projection (CLP) matrix learnt from a small development set.
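The idea of a cross-lingual linear projection can be illustrated with a minimal sketch. The abstract does not say how the CLP matrix is estimated, so the code below assumes an ordinary least-squares fit on aligned source/target embedding pairs. The function names, dimensions, and synthetic data are hypothetical and stand in for real contextual embeddings from multilingual BERT or XLM-RoBERTa.

```python
import numpy as np


def fit_linear_projection(src, tgt):
    """Return W minimizing ||src @ W - tgt||_F (ordinary least squares).

    Assumption: this is one plausible way to learn a linear mapping;
    it is not claimed to be the estimation procedure used in the paper.
    """
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in for a small development set of aligned embeddings:
# each target vector is a noisy linear transform of its source vector.
rng = np.random.default_rng(0)
dim, n_pairs = 8, 200
true_map = rng.normal(size=(dim, dim))
src = rng.normal(size=(n_pairs, dim))
tgt = src @ true_map + 0.01 * rng.normal(size=(n_pairs, dim))

W = fit_linear_projection(src, tgt)

# Mean cosine similarity of aligned pairs before and after projecting
# the source embeddings into the target space.
before = np.mean([cosine(s, t) for s, t in zip(src, tgt)])
after = np.mean([cosine(s @ W, t) for s, t in zip(src, tgt)])
print(f"mean cosine before projection: {before:.3f}")
print(f"mean cosine after projection:  {after:.3f}")
```

In this toy setting the projected source vectors end up much closer to their target counterparts than the raw source vectors, which is the effect the abstract attributes to projecting source embeddings before computing cross-lingual similarity.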
Anthology ID:
2020.wmt-1.100
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
903–910
URL:
https://aclanthology.org/2020.wmt-1.100
Cite (ACL):
Chi-kiu Lo and Samuel Larkin. 2020. Machine Translation Reference-less Evaluation using YiSi-2 with Bilingual Mappings of Massive Multilingual Language Model. In Proceedings of the Fifth Conference on Machine Translation, pages 903–910, Online. Association for Computational Linguistics.
Cite (Informal):
Machine Translation Reference-less Evaluation using YiSi-2 with Bilingual Mappings of Massive Multilingual Language Model (Lo & Larkin, WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.100.pdf
Video:
https://slideslive.com/38939653