Exploring Pair-Wise NMT for Indian Languages

Kartheek Akella, Sai Himal Allu, Sridhar Suresh Ragupathi, Aman Singhal, Zeeshan Khan, C.V. Jawahar, Vinay P. Namboodiri


Abstract
In this paper, we address the task of improving pair-wise machine translation for specific low-resource Indian languages. Multilingual NMT models have demonstrated reasonable effectiveness on resource-poor languages. In this work, we show that the performance of these models can be significantly improved through a filtered back-translation process and subsequent fine-tuning on the limited pair-wise language corpora. The analysis in this paper suggests that this method can significantly improve multilingual models’ performance over their baselines, yielding state-of-the-art results for various Indian languages.
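The filtered back-translation step named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the filtering criterion, so everything here is an assumption for illustration: the function names, the `threshold` parameter, and the toy length-ratio score stand in for whatever back-translation model and quality measure the paper actually uses.

```python
from typing import Callable, Iterable, List, Tuple


def filtered_back_translation(
    target_sentences: Iterable[str],
    back_translate: Callable[[str], str],
    score: Callable[[str, str], float],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return (synthetic_source, target) pairs whose score meets the threshold.

    `back_translate` and `score` are hypothetical callables supplied by the
    caller; they are not taken from the paper.
    """
    kept = []
    for target in target_sentences:
        # Translate monolingual target-side text back into the source language.
        synthetic_source = back_translate(target)
        # Keep the synthetic pair only if its quality score is high enough.
        if score(synthetic_source, target) >= threshold:
            kept.append((synthetic_source, target))
    return kept


def length_ratio_score(source: str, target: str) -> float:
    """Toy quality score (an assumption): shorter/longer token count, in [0, 1]."""
    n_src, n_tgt = len(source.split()), len(target.split())
    if n_src == 0 or n_tgt == 0:
        return 0.0
    return min(n_src, n_tgt) / max(n_src, n_tgt)
```

In this sketch, the pairs that survive filtering would be added to the synthetic training data, after which the multilingual model would be fine-tuned on the limited genuine pair-wise corpus, as the abstract describes.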
Anthology ID:
2020.icon-main.59
Volume:
Proceedings of the 17th International Conference on Natural Language Processing (ICON)
Month:
December
Year:
2020
Address:
Indian Institute of Technology Patna, Patna, India
Editors:
Pushpak Bhattacharyya, Dipti Misra Sharma, Rajeev Sangal
Venue:
ICON
Publisher:
NLP Association of India (NLPAI)
Pages:
437–443
URL:
https://aclanthology.org/2020.icon-main.59
Cite (ACL):
Kartheek Akella, Sai Himal Allu, Sridhar Suresh Ragupathi, Aman Singhal, Zeeshan Khan, C.V. Jawahar, and Vinay P. Namboodiri. 2020. Exploring Pair-Wise NMT for Indian Languages. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pages 437–443, Indian Institute of Technology Patna, Patna, India. NLP Association of India (NLPAI).
Cite (Informal):
Exploring Pair-Wise NMT for Indian Languages (Akella et al., ICON 2020)
PDF:
https://aclanthology.org/2020.icon-main.59.pdf