JUNLP@ICON2020: Low Resourced Machine Translation for Indic Languages

Sainik Mahata, Dipankar Das, Sivaji Bandyopadhyay


Abstract
In this work, we describe the systems submitted to a machine translation shared task organized at ICON 2020, the 17th International Conference on Natural Language Processing. The systems were developed to demonstrate the capability of general-domain machine translation when translating into Indic languages; in our case, the language pair is English-Hindi. The paper describes the training process and quantifies the performance of two state-of-the-art translation approaches, viz., Statistical Machine Translation (SMT) and Neural Machine Translation (NMT). While SMT systems work better in a low-resource setting, NMT systems are able to generate more fluent sentences. Since the two approaches have complementary advantages, a hybrid system incorporating both was also developed to leverage the strengths of each. The submitted SMT, NMT, and hybrid systems obtained BLEU scores of 8.701943312, 0.6361336198, and 11.78873307, respectively, and the score of the hybrid system placed us fourth on the competition leaderboard.
Anthology ID:
2020.icon-adapmt.1
Volume:
Proceedings of the 17th International Conference on Natural Language Processing (ICON): Adap-MT 2020 Shared Task
Month:
December
Year:
2020
Address:
Patna, India
Venue:
ICON
Publisher:
NLP Association of India (NLPAI)
Pages:
1–5
URL:
https://aclanthology.org/2020.icon-adapmt.1
Cite (ACL):
Sainik Mahata, Dipankar Das, and Sivaji Bandyopadhyay. 2020. JUNLP@ICON2020: Low Resourced Machine Translation for Indic Languages. In Proceedings of the 17th International Conference on Natural Language Processing (ICON): Adap-MT 2020 Shared Task, pages 1–5, Patna, India. NLP Association of India (NLPAI).
Cite (Informal):
JUNLP@ICON2020: Low Resourced Machine Translation for Indic Languages (Mahata et al., ICON 2020)
PDF:
https://aclanthology.org/2020.icon-adapmt.1.pdf