Neural Machine Translation for English-Tamil

Himanshu Choudhary, Aditya Kumar Pathak, Rajiv Ratan Saha, Ponnurangam Kumaraguru


Abstract
A large amount of valuable content is available on the web in English and is often translated into local languages to facilitate knowledge sharing among people who are not fluent in English. However, translating such content manually is a tedious, costly, and time-consuming process. Machine translation offers an efficient way to translate text without human involvement. Neural machine translation (NMT) is one of the most recent and effective machine translation techniques. In this paper, we apply NMT to the English-Tamil language pair. We propose a novel NMT technique that combines word embeddings with Byte-Pair Encoding (BPE) to build an efficient translation system that overcomes the out-of-vocabulary (OOV) problem for languages with few translations available online. We use the BLEU score to evaluate system performance. Experimental results confirm that our proposed MIDAS translator (BLEU score 8.33) outperforms Google Translate (BLEU score 3.75).
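To illustrate why BPE helps with the OOV problem mentioned in the abstract, the following is a minimal sketch of the generic BPE merge-learning procedure, not the authors' implementation. The function names (`learn_bpe`, `segment`), the `</w>` end-of-word marker, and the toy English word counts are all assumptions chosen for illustration; the paper's actual preprocessing, vocabulary size, and tooling are not stated here.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Return a new vocab where every occurrence of `pair` is fused."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a word-frequency dict."""
    # Start from characters plus a hypothetical end-of-word marker.
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word by replaying the learned merges."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        i = 0
        while i < len(symbols) - 1:
            if (symbols[i], symbols[i + 1]) == pair:
                symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
            else:
                i += 1
    return symbols


if __name__ == "__main__":
    # Toy corpus (illustrative only, not data from the paper).
    toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    rules = learn_bpe(toy_counts, num_merges=5)
    print(segment("lowest", rules))
```

With this toy corpus, the word "lowest" never appears in training, yet it is segmented into the known subword units `low` and `est</w>` rather than being mapped to an unknown token. This is the general mechanism by which subword vocabularies reduce OOV words in low-resource settings.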
Anthology ID:
W18-6459
Volume:
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Month:
October
Year:
2018
Address:
Belgium, Brussels
Venues:
EMNLP | WMT | WS
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
770–775
URL:
https://aclanthology.org/W18-6459
DOI:
10.18653/v1/W18-6459
Cite (ACL):
Himanshu Choudhary, Aditya Kumar Pathak, Rajiv Ratan Saha, and Ponnurangam Kumaraguru. 2018. Neural Machine Translation for English-Tamil. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 770–775, Belgium, Brussels. Association for Computational Linguistics.
Cite (Informal):
Neural Machine Translation for English-Tamil (Choudhary et al., 2018)
PDF:
https://aclanthology.org/W18-6459.pdf