MUCS@ - Machine Translation for Dravidian Languages using Stacked Long Short Term Memory

Asha Hegde, Ibrahim Gashaw, Shashirekha H.l.


Abstract
The Dravidian language family is one of the largest language families in the world. Despite their uniqueness, Dravidian languages have received very little attention because of the scarcity of resources for language technology tasks such as translation, Parts-of-Speech tagging, and Word Sense Disambiguation. In this paper, we, team MUCS, describe the sequence-to-sequence stacked Long Short Term Memory (LSTM) based Neural Machine Translation (NMT) model submitted to “Machine Translation in Dravidian languages”, a shared task organized at EACL-2021. The NMT model was applied to translation using the English-Tamil, English-Telugu, English-Malayalam and Tamil-Telugu corpora provided by the organizers. Standard evaluation metrics, namely Bilingual Evaluation Understudy (BLEU) and human evaluation, are used to evaluate the model. Our models exhibited good accuracy for all the language pairs and obtained 2nd rank for the Tamil-Telugu language pair.
Anthology ID:
2021.dravidianlangtech-1.50
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Venue:
DravidianLangTech
Publisher:
Association for Computational Linguistics
Pages:
340–345
URL:
https://aclanthology.org/2021.dravidianlangtech-1.50
Cite (ACL):
Asha Hegde, Ibrahim Gashaw, and Shashirekha H.l. 2021. MUCS@ - Machine Translation for Dravidian Languages using Stacked Long Short Term Memory. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 340–345, Kyiv. Association for Computational Linguistics.
Cite (Informal):
MUCS@ - Machine Translation for Dravidian Languages using Stacked Long Short Term Memory (Hegde et al., DravidianLangTech 2021)
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.50.pdf
Software:
2021.dravidianlangtech-1.50.Software.zip
Dataset:
2021.dravidianlangtech-1.50.Dataset.zip