A Filtering Approach to Object Region Detection in Multimodal Machine Translation

Ali Hatami, Paul Buitelaar, Mihael Arcan


Abstract
Recent studies in Multimodal Machine Translation (MMT) have explored the use of visual information in a multimodal setting to analyze its redundancy with textual information. The aim of this work is to develop a more effective approach to incorporating relevant visual information into the translation process and to improve the overall performance of MMT models. This paper proposes an object-level filtering approach for MMT, which is applied to object regions extracted from an image and filters out objects that are irrelevant to the image caption being translated. Using the filtered image helps the model consider only relevant objects and their locations relative to each other. Different matching methods, including string matching and word embeddings, are employed to identify relevant objects. Gaussian blurring is used to soften irrelevant objects in the image and to evaluate the effect of object filtering on translation quality. The filtering approaches are evaluated on the Multi30K dataset for English-to-German, English-to-French, and English-to-Czech translation, using the BLEU, ChrF2, and TER metrics.
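The string-matching step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the detection format (a `label` plus a `box`), the helper names `tokenize`, `is_relevant`, and `filter_objects`, and the naive plural rule are all assumptions made for illustration. The abstract does not specify the object detector, the label set, or the exact matching rules.

```python
import re

def tokenize(caption):
    """Lowercase the caption and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", caption.lower())

def is_relevant(label, tokens):
    """Assumed rule: exact match, or a naive plural match ("dog" vs "dogs")."""
    label = label.lower()
    return any(tok in (label, label + "s", label + "es") for tok in tokens)

def filter_objects(detections, caption):
    """Split detections into (relevant, irrelevant) by matching labels to the caption."""
    tokens = tokenize(caption)
    relevant = [d for d in detections if is_relevant(d["label"], tokens)]
    irrelevant = [d for d in detections if not is_relevant(d["label"], tokens)]
    return relevant, irrelevant

# Hypothetical detector output: a label and an (x1, y1, x2, y2) bounding box.
detections = [
    {"label": "dog", "box": (10, 20, 110, 140)},
    {"label": "frisbee", "box": (150, 30, 200, 70)},
    {"label": "car", "box": (220, 60, 300, 120)},
]
caption = "Two dogs chase a frisbee in the park."

relevant, irrelevant = filter_objects(detections, caption)
print([d["label"] for d in relevant])    # objects kept sharp
print([d["label"] for d in irrelevant])  # objects whose boxes would be blurred
```

In this sketch the boxes in `irrelevant` are the regions that would then be softened with Gaussian blurring. Multi-word labels and synonyms (for example "puppy" for "dog") are not matched by this rule, which is the kind of gap that embedding-based matching is meant to cover.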
Anthology ID:
2023.mtsummit-research.33
Volume:
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
Month:
September
Year:
2023
Address:
Macau SAR, China
Editors:
Masao Utiyama, Rui Wang
Venue:
MTSummit
Publisher:
Asia-Pacific Association for Machine Translation
Pages:
393–405
URL:
https://aclanthology.org/2023.mtsummit-research.33
Cite (ACL):
Ali Hatami, Paul Buitelaar, and Mihael Arcan. 2023. A Filtering Approach to Object Region Detection in Multimodal Machine Translation. In Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track, pages 393–405, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal):
A Filtering Approach to Object Region Detection in Multimodal Machine Translation (Hatami et al., MTSummit 2023)
PDF:
https://aclanthology.org/2023.mtsummit-research.33.pdf