Pranav Gaikwad
2023
Machine Translation Advancements for Low-Resource Indian Languages in WMT23: CFILT-IITB’s Effort for Bridging the Gap
Pranav Gaikwad
|
Meet Doshi
|
Sourabh Deoghare
|
Pushpak Bhattacharyya
Proceedings of the Eighth Conference on Machine Translation
This paper is related to the submission of the CFILT-IITB team for the task called IndicMT in WMT23. The paper describes our MT systems submitted to the WMT23 IndicMT shared task. The task focused on MT system development from/to English and four low-resource North-East Indian languages, viz., Assamese, Khasi, Manipuri, and Mizo. We trained them on a small parallel corpus resulting in poor-quality systems. Therefore, we utilize transfer learning with the help of a large pre-trained multilingual NMT system. Since this approach produced the best results, we submitted our NMT models for the shared task using this approach.