Machine Translation of Bi-lingual Hindi-English (Hinglish) Text

R. Mahesh K. Sinha, Anil Thakur


Abstract
In the present communication-based society, no natural language seems to have been left untouched by the trend of code-mixing. For different communicative purposes, a language borrows linguistic codes from other languages. This gives rise to a mixed language that is neither entirely the host language nor the foreign language. Such a mixed language poses a new challenge for machine translation. It is necessary to identify the “foreign” elements in the source language and process them accordingly. The foreign elements may not appear in their original form and may be morphologically transformed according to the host language. Further, in a complex sentence, one clause/utterance may be in the host language while another clause/utterance is in the foreign language. Code-mixing of Hindi and English, where Hindi is the host language, is a common phenomenon in day-to-day language usage in Indian metropolises. The phenomenon is so common that people have started to consider it a distinct variety altogether and to call it Hinglish. In this paper, we present a mechanism for machine translation of Hinglish into pure (standard) Hindi and pure English forms.
Anthology ID:
2005.mtsummit-papers.20
Volume:
Proceedings of Machine Translation Summit X: Papers
Month:
September 13-15
Year:
2005
Address:
Phuket, Thailand
Venue:
MTSummit
Pages:
149–156
URL:
https://aclanthology.org/2005.mtsummit-papers.20
Cite (ACL):
R. Mahesh K. Sinha and Anil Thakur. 2005. Machine Translation of Bi-lingual Hindi-English (Hinglish) Text. In Proceedings of Machine Translation Summit X: Papers, pages 149–156, Phuket, Thailand.
Cite (Informal):
Machine Translation of Bi-lingual Hindi-English (Hinglish) Text (Sinha & Thakur, MTSummit 2005)
PDF:
https://aclanthology.org/2005.mtsummit-papers.20.pdf