Modifications of Machine Translation Evaluation Metrics by Using Word Embeddings

Haozhou Wang, Paola Merlo


Abstract
Traditional machine translation evaluation metrics such as BLEU and WER are widely used, but they correlate poorly with human judgements because they represent word similarity badly and impose strict identity matching. In this paper, we propose modifications to these two metrics based on word embeddings. The evaluation results show that our modifications significantly improve their correlation with human judgements.
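To make the idea of relaxing strict identity matching concrete, here is a minimal sketch of a word error rate whose substitution cost is softened by embedding similarity. It is an illustration under stated assumptions, not the formulation used in the paper, which the abstract does not spell out. The function names (`soft_wer`, `substitution_cost`), the choice of one minus cosine similarity as the cost, and the toy two-dimensional vectors are all hypothetical.

```python
import math

# Toy 2-d vectors invented for illustration only (not trained embeddings).
TOY_EMBEDDINGS = {
    "cat": (1.0, 0.0),
    "kitten": (0.8, 0.6),
    "car": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def substitution_cost(hyp_word, ref_word, embeddings):
    """Assumed cost: 0 for identical words, 1 - cosine for known pairs, else 1."""
    if hyp_word == ref_word:
        return 0.0
    if hyp_word in embeddings and ref_word in embeddings:
        sim = cosine(embeddings[hyp_word], embeddings[ref_word])
        return 1.0 - max(0.0, min(1.0, sim))  # clamp similarity to [0, 1]
    return 1.0  # unknown words fall back to strict matching


def soft_wer(hypothesis, reference, embeddings):
    """Edit distance with soft substitutions, normalised by reference length."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the cost of aligning the processed hypothesis prefix with ref[:j].
    prev = [float(j) for j in range(len(ref) + 1)]
    for i, h in enumerate(hyp, 1):
        curr = [float(i)]
        for j, r in enumerate(ref, 1):
            curr.append(min(
                prev[j] + 1.0,      # extra word in the hypothesis
                curr[j - 1] + 1.0,  # reference word missing from the hypothesis
                prev[j - 1] + substitution_cost(h, r, embeddings),
            ))
        prev = curr
    return prev[-1] / len(ref)


strict = soft_wer("the kitten sat", "the cat sat", {})            # identity matching only
soft = soft_wer("the kitten sat", "the cat sat", TOY_EMBEDDINGS)  # embedding-aware
```

With an empty embedding table the function reduces to ordinary WER, so "kitten" for "cat" counts as a full error. With the toy vectors, the same substitution is penalised only in proportion to how dissimilar the two words are.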
Anthology ID:
W16-4505
Volume:
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Patrik Lambert, Bogdan Babych, Kurt Eberle, Rafael E. Banchs, Reinhard Rapp, Marta R. Costa-jussà
Venue:
HyTra
Publisher:
The COLING 2016 Organizing Committee
Pages:
33–41
URL:
https://aclanthology.org/W16-4505
Cite (ACL):
Haozhou Wang and Paola Merlo. 2016. Modifications of Machine Translation Evaluation Metrics by Using Word Embeddings. In Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6), pages 33–41, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Modifications of Machine Translation Evaluation Metrics by Using Word Embeddings (Wang & Merlo, HyTra 2016)
PDF:
https://aclanthology.org/W16-4505.pdf