BITS-P at WAT 2023: Improving Indic Language Multimodal Translation by Image Augmentation using Diffusion Models

Amulya Dash, Hrithik Raj Gupta, Yashvardhan Sharma


Abstract
This paper describes our system for multimodal machine translation. We participated in the multimodal translation tasks from English into three Indic languages: Hindi, Bengali, and Malayalam. We leverage the inherent richness of multimodal data to reduce ambiguity in translation. We fine-tuned the ‘No Language Left Behind’ (NLLB) machine translation model for multimodal translation and further improved its accuracy through image data augmentation using latent diffusion. Our submission achieves the best BLEU score for the English-Hindi, English-Bengali, and English-Malayalam language pairs on both the Evaluation and Challenge test sets.
Anthology ID:
2023.wat-1.3
Volume:
Proceedings of the 10th Workshop on Asian Translation
Month:
September
Year:
2023
Address:
Macau SAR, China
Editors:
Toshiaki Nakazawa, Kazutaka Kinugawa, Hideya Mino, Isao Goto, Raj Dabre, Shohei Higashiyama, Shantipriya Parida, Makoto Morishita, Ondrej Bojar, Akiko Eriguchi, Yusuke Oda, Chenhui Chu, Sadao Kurohashi
Venue:
WAT
Publisher:
Asia-Pacific Association for Machine Translation
Pages:
41–45
URL:
https://aclanthology.org/2023.wat-1.3
Cite (ACL):
Amulya Dash, Hrithik Raj Gupta, and Yashvardhan Sharma. 2023. BITS-P at WAT 2023: Improving Indic Language Multimodal Translation by Image Augmentation using Diffusion Models. In Proceedings of the 10th Workshop on Asian Translation, pages 41–45, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal):
BITS-P at WAT 2023: Improving Indic Language Multimodal Translation by Image Augmentation using Diffusion Models (Dash et al., WAT 2023)
PDF:
https://aclanthology.org/2023.wat-1.3.pdf