Anil Thakur


2005

Machine Translation of Bi-lingual Hindi-English (Hinglish) Text
R. Mahesh K. Sinha | Anil Thakur
Proceedings of Machine Translation Summit X: Papers

In the present communication-based society, no natural language seems to have been left untouched by the trends of code-mixing. For different communicative purposes, a language uses linguistic codes from other languages. This gives rise to a mixed language which is neither totally the host language nor the foreign language. The mixed language poses a new challenge for machine translation. It is necessary to identify the “foreign” elements in the source language and process them accordingly. The foreign elements may not appear in their original form and may get morphologically transformed as per the host language. Further, in a complex sentence, a clause/utterance may be in the host language while another clause/utterance may be in the foreign language. Code-mixing of Hindi and English, where Hindi is the host language, is a common phenomenon in day-to-day language usage in Indian metropolises. The scenario is so common that people have started considering this a different variety altogether and calling it by the name Hinglish. In this paper, we present a mechanism for machine translation of Hinglish to pure (standard) Hindi and pure English forms.

Dealing with Replicative Words in Hindi for Machine Translation to English
R. Mahesh K. Sinha | Anil Thakur
Proceedings of Machine Translation Summit X: Papers

The South Asian languages are well-known for their replicative words. In these languages, words of almost all the grammatical categories can occur in their reduplicative form. Hindi is one such language, rich in various types of replicative words in its lexicon. The traditional grammars and some of the research works have discussed the topic to some extent, particularly from the point of view of their descriptions and classifications. However, a detailed study of the topic becomes significant in view of the complexity involved in handling such replicative words in natural language processing, particularly for machine translation. In this paper, we discuss different types of replicative words in Hindi and their syntactic and semantic characteristics to formulate rules and strategies to identify their multiple functions and mapping patterns in English for machine translation from Hindi to English.

Divergence Patterns in Machine Translation between Hindi and English
R. Mahesh K. Sinha | Anil Thakur
Proceedings of Machine Translation Summit X: Posters

The issue of translation divergence is an important research topic in the area of machine translation. An exhaustive study of the divergence issues in MT is necessary for their proper classification and resolution. In the literature on MT, scholars have examined the issue and have proposed ways for its classification and resolution (Dorr 1993, 1994). However, the topic still needs further exploration to identify different sources of translation divergence in different pairs of translation languages. In this paper, we discuss translation patterns between Hindi and English of different types of constructions with a view to identifying the potential topics of translation divergence. We take Dorr’s (1993, 1994) classification of translation divergence as the base to examine the different topics of translation divergence in Hindi and English. The primary goal of the paper is to point out different types of translation divergences in Hindi and English MT that have not been discussed in the existing literature.

Handling ki in Hindi for Hindi-English MT
R. Mahesh K. Sinha | Anil Thakur
Proceedings of Machine Translation Summit X: Posters

“ki” is an indeclinable element (particle) in Hindi which is used in multiple roles that have multiple mapping patterns in English. In one of its uses, “ki” functions as a clause complementizer and is usually mapped to “that” in declarative clauses and to various wh-words (such as “what”, “why”, “where”, “how”, etc.) in interrogative clauses. The contexts of these mappings are dependent on syntactic-semantic types of the clause. In its non-complementizer use, “ki” is used to denote various other functions such as coordinate conjunction, purpose and reason clause conjunction, yes-no question particle, etc. It is a difficult task to identify the different uses of “ki” and determine its multiple mapping patterns in the context of Hindi-English machine translation. A detailed linguistic analysis is needed to disambiguate the different contexts of “ki” in Hindi. In this paper, we examine the multiple uses and patterns of “ki” in Hindi and propose strategies for their identification and disambiguation for Hindi-English MT.

Translation divergence in English-Hindi MT
R. Mahesh K. Sinha | Anil Thakur
Proceedings of the 10th EAMT Conference: Practical applications of machine translation