Multimodal Machine Translation with Embedding Prediction

Tosho Hirasawa, Hayahide Yamagishi, Yukio Matsumura, Mamoru Komachi


Abstract
Multimodal machine translation is an attractive application of neural machine translation (NMT). It helps computers to deeply understand visual objects and their relations with natural languages. However, multimodal NMT systems suffer from a shortage of available training data, resulting in poor performance for translating rare words. Pretrained word embeddings have been shown to improve NMT in low-resource domains, and a search-based approach has been proposed to address the rare word problem. In this study, we effectively combine these two approaches in the context of multimodal NMT and explore how we can take full advantage of pretrained word embeddings to better translate rare words. We report overall performance improvements of 1.24 METEOR and 2.49 BLEU and achieve an improvement of 7.67 F-score for rare word translation.
Anthology ID:
N19-3012
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
86–91
URL:
https://aclanthology.org/N19-3012
DOI:
10.18653/v1/N19-3012
Cite (ACL):
Tosho Hirasawa, Hayahide Yamagishi, Yukio Matsumura, and Mamoru Komachi. 2019. Multimodal Machine Translation with Embedding Prediction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 86–91, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Multimodal Machine Translation with Embedding Prediction (Hirasawa et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-3012.pdf
Presentation:
N19-3012.Presentation.pdf
Video:
https://vimeo.com/355800547
Code:
toshohirasawa/nmtpytorch-emb-pred