TMEKU System for the WAT2021 Multimodal Translation Task

Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu


Abstract
We introduce our TMEKU system submitted to the English-Japanese Multimodal Translation Task for WAT 2021. We participated in the Flickr30kEnt-JP task and the Ambiguous MSCOCO Multimodal task under the constrained condition, using only the officially provided datasets. Our proposed system employs soft word-region alignment for multimodal neural machine translation (MNMT). The experimental results, evaluated on the BLEU metric provided by the WAT 2021 evaluation site, show that the TMEKU system achieved the best performance among all participating systems. Further analysis through a case study demonstrates that leveraging word-region alignment between the textual and visual modalities is the key to the performance gains of our TMEKU system, as it leads to better use of visual information.
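To illustrate the general idea of soft word-region alignment named in the abstract, the sketch below shows one common way to realize it: each source-word encoder state attends over visual region features, yielding a soft alignment distribution and a per-word visual context vector. This is a generic, hypothetical illustration (module name, dimensions, and projections are assumptions), not the TMEKU system's actual implementation, which is described in the paper itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftWordRegionAlignment(nn.Module):
    """Illustrative soft word-region alignment: each source word attends over
    image region features and receives a weighted visual context vector."""

    def __init__(self, d_model: int, d_region: int, d_k: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_k)      # project word states to queries
        self.k_proj = nn.Linear(d_region, d_k)     # project region features to keys
        self.v_proj = nn.Linear(d_region, d_model) # project region features to values
        self.scale = d_k ** 0.5

    def forward(self, word_states: torch.Tensor, region_feats: torch.Tensor):
        # word_states: (src_len, d_model) encoder states for source words
        # region_feats: (num_regions, d_region) visual features for image regions
        q = self.q_proj(word_states)
        k = self.k_proj(region_feats)
        v = self.v_proj(region_feats)
        scores = q @ k.t() / self.scale            # (src_len, num_regions)
        align = F.softmax(scores, dim=-1)          # soft alignment: word -> regions
        visual_ctx = align @ v                     # (src_len, d_model) visual context
        return visual_ctx, align
```

In such a setup, the resulting per-word visual context would typically be fused with the textual representation inside the MNMT decoder; the soft alignment matrix is also what a case study can inspect to see which regions each word attends to.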
Anthology ID:
2021.wat-1.20
Volume:
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Shohei Higashiyama, Hiroshi Manabe, Win Pa Pa, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
Venue:
WAT
Publisher:
Association for Computational Linguistics
Pages:
174–180
URL:
https://aclanthology.org/2021.wat-1.20
DOI:
10.18653/v1/2021.wat-1.20
Cite (ACL):
Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, and Chenhui Chu. 2021. TMEKU System for the WAT2021 Multimodal Translation Task. In Proceedings of the 8th Workshop on Asian Translation (WAT2021), pages 174–180, Online. Association for Computational Linguistics.
Cite (Informal):
TMEKU System for the WAT2021 Multimodal Translation Task (Zhao et al., WAT 2021)
PDF:
https://aclanthology.org/2021.wat-1.20.pdf
Data
Flickr30K Entities, Flickr30k