Iterative Back Translation Revisited: An Experimental Investigation for Low-resource English Assamese Neural Machine Translation

Ahmed Mazida, Kashyap Kishore, Talukdar Kuwali, Boruah Parvez


Abstract
Back translation has been an effective strategy for leveraging monolingual data on both the source and target sides. Research has opened up several ways to improve the procedure; one of them is iterative back translation, where the monolingual data is repeatedly translated and used to retrain the model. Despite its success, iterative back translation remains relatively unexplored in low-resource scenarios, particularly for rich Indic languages. This paper presents a comprehensive investigation into the application of iterative back translation to the low-resource English-Assamese language pair. A simplified version of iterative back translation is presented. The study explores several critical aspects of back translation, including the balance between original and synthetic data and the refinement of the target (backward) model through retraining on cleaner data. The experimental results demonstrate significant improvements in translation quality. Specifically, the simplified approach to iterative back translation yields a +6.38 BLEU improvement for the English-to-Assamese translation direction and a +4.38 BLEU improvement for the Assamese-to-English translation direction. Further gains are observed when higher-quality, cleaner data is used for model retraining, highlighting the potential of iterative back translation as a valuable tool for enhancing low-resource neural machine translation (NMT).
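As a rough illustration of the loop the abstract describes, the following is a minimal sketch and not the authors' implementation. It assumes a toy word-for-word lookup "model" in place of a real NMT system, and the names `train`, `translate`, `iterative_back_translation`, `rounds`, and `max_ratio` are hypothetical. The `max_ratio` cap is one simple way to control the balance between original and synthetic data; the paper may handle this differently.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Toy stand-in for NMT training (assumption): map each source word
    to the target word it is most often position-aligned with."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            counts[s][t] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def translate(model, sentence):
    # Words the toy model has never seen are copied through unchanged.
    return " ".join(model.get(w, w) for w in sentence.split())


def flip(pairs):
    # Turn (src, tgt) pairs into (tgt, src) pairs for the backward direction.
    return [(t, s) for s, t in pairs]


def iterative_back_translation(parallel, mono_src, mono_tgt,
                               rounds=2, max_ratio=1.0):
    """Return (forward, backward) toy models after `rounds` iterations."""
    # Cap synthetic pairs relative to the amount of original parallel data.
    limit = int(max_ratio * len(parallel))
    forward = train(parallel)
    backward = train(flip(parallel))
    for _ in range(rounds):
        # Back-translate target-side monolingual text into synthetic sources,
        # then retrain the forward model on original + synthetic pairs.
        synth_fwd = [(translate(backward, t), t) for t in mono_tgt[:limit]]
        forward = train(parallel + synth_fwd)
        # Translate source-side monolingual text with the updated forward
        # model, then retrain the backward model the same way.
        synth_bwd = [(translate(forward, s), s) for s in mono_src[:limit]]
        backward = train(flip(parallel) + synth_bwd)
    return forward, backward


# Placeholder tokens only; no real English or Assamese data is used here.
parallel = [("e1 e2", "a1 a2"), ("e3", "a3")]
forward, backward = iterative_back_translation(
    parallel, mono_src=["e1 e3"], mono_tgt=["a2 a3"])
print(translate(forward, "e1 e2 e3"))
```

In a real setting, `train` and `translate` would be replaced by training and decoding with actual NMT models, and the synthetic pairs would typically be filtered or weighted before retraining.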
Anthology ID:
2023.icon-1.17
Volume:
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
Month:
December
Year:
2023
Address:
Goa University, Goa, India
Editors:
D. Pawar Jyoti, Lalitha Devi Sobha
Venue:
ICON
SIG:
SIGLEX
Publisher:
NLP Association of India (NLPAI)
Pages:
172–179
URL:
https://aclanthology.org/2023.icon-1.17
Cite (ACL):
Ahmed Mazida, Kashyap Kishore, Talukdar Kuwali, and Boruah Parvez. 2023. Iterative Back Translation Revisited: An Experimental Investigation for Low-resource English Assamese Neural Machine Translation. In Proceedings of the 20th International Conference on Natural Language Processing (ICON), pages 172–179, Goa University, Goa, India. NLP Association of India (NLPAI).
Cite (Informal):
Iterative Back Translation Revisited: An Experimental Investigation for Low-resource English Assamese Neural Machine Translation (Mazida et al., ICON 2023)
PDF:
https://aclanthology.org/2023.icon-1.17.pdf