Hinglish to English Machine Translation using Multilingual Transformers

Vibhav Agarwal, Pooja Rao, Dinesh Babu Jayagopi


Abstract
Code-mixed language plays an important role in communication in multilingual societies, and its use has grown with the recent increase in internet users in those societies. However, translation between code-mixed Hinglish and English, in either direction, has not been explored extensively. Motivated by the recent success of large pretrained language models, we explore the use of multilingual pretrained transformers such as mBART and mT5 for one such task: code-mixed Hinglish to English machine translation. We compare our approach with the only existing baseline on the PHINC dataset and report a significant jump in BLEU score from 15.3 to 29.5, a 92.8% relative improvement on the same dataset.
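The 92.8% figure is the relative gain of the new BLEU score over the baseline. As a minimal sketch of that arithmetic (the function name `relative_improvement` is hypothetical and not from the paper; only the two BLEU values come from the abstract):

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the relative gain of `new` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# BLEU scores reported in the abstract for the PHINC dataset.
baseline_bleu = 15.3
new_bleu = 29.5

gain = relative_improvement(baseline_bleu, new_bleu)
print(f"Relative BLEU improvement: {gain:.1f}%")
```

The absolute gain is 14.2 BLEU points; dividing by the 15.3 baseline gives the relative figure quoted in the abstract.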
Anthology ID:
2021.ranlp-srw.3
Volume:
Proceedings of the Student Research Workshop Associated with RANLP 2021
Month:
September
Year:
2021
Address:
Online
Editors:
Souhila Djabri, Dinara Gimadi, Tsvetomila Mihaylova, Ivelina Nikolova-Koleva
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
16–21
URL:
https://aclanthology.org/2021.ranlp-srw.3
Cite (ACL):
Vibhav Agarwal, Pooja Rao, and Dinesh Babu Jayagopi. 2021. Hinglish to English Machine Translation using Multilingual Transformers. In Proceedings of the Student Research Workshop Associated with RANLP 2021, pages 16–21, Online. INCOMA Ltd.
Cite (Informal):
Hinglish to English Machine Translation using Multilingual Transformers (Agarwal et al., RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-srw.3.pdf
Data
PHINC