Introducing EM-FT for Manipuri-English Neural Machine Translation

Rudali Huidrom, Yves Lepage


Abstract
This paper introduces a pretrained word embedding for Manipuri (mni), a low-resourced Indian language. The embedding is based on FastText and is capable of handling Manipuri, a highly agglutinating language. We then perform machine translation (MT) experiments using neural network (NN) models and confirm two observations. Firstly, the Transformer architecture with the FastText word embedding model EM-FT achieves better BLEU scores than the same architecture without it in all the NMT experiments. Secondly, adding more training data from a domain different from that of the test data negatively impacts translation accuracy. The resources reported in this paper are made available in the ELRA catalogue to help the low-resourced languages community with MT/NLP tasks.
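As a rough illustration of why a FastText-style embedding suits an agglutinating language, the sketch below splits a word into the character n-grams that FastText represents it with, so that words sharing stems or affixes also share n-grams. It is a minimal sketch and not the authors' implementation: the function names `char_ngrams` and `shared_ngrams` are hypothetical, the n-gram range of 3 to 6 is the FastText library default and is only assumed here (the abstract does not state the settings used for EM-FT), and English words stand in for Manipuri examples.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return FastText-style character n-grams of a word.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal character sequences.
    min_n and max_n are assumed values (the FastText defaults).
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of length n over the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def shared_ngrams(a, b, min_n=3, max_n=6):
    """Return the set of character n-grams that two words have in common."""
    return set(char_ngrams(a, min_n, max_n)) & set(char_ngrams(b, min_n, max_n))


if __name__ == "__main__":
    # Trigrams only, for readability.
    print(char_ngrams("where", 3, 3))
    # Morphologically related words overlap in their n-grams, so a model
    # that sums n-gram vectors can relate forms it has rarely or never seen.
    print(sorted(shared_ngrams("translate", "translation")))
```

In the actual FastText library the n-grams are additionally hashed into a fixed number of buckets and their vectors are summed with a vector for the whole word; this sketch shows only the decomposition step.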
Anthology ID:
2022.wildre-1.1
Volume:
Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Girish Nath Jha, Sobha L., Kalika Bali, Atul Kr. Ojha
Venue:
WILDRE
Publisher:
European Language Resources Association
Pages:
1–6
URL:
https://aclanthology.org/2022.wildre-1.1
Cite (ACL):
Rudali Huidrom and Yves Lepage. 2022. Introducing EM-FT for Manipuri-English Neural Machine Translation. In Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference, pages 1–6, Marseille, France. European Language Resources Association.
Cite (Informal):
Introducing EM-FT for Manipuri-English Neural Machine Translation (Huidrom & Lepage, WILDRE 2022)
PDF:
https://aclanthology.org/2022.wildre-1.1.pdf