Ajit Kumar


2020

pdf bib
Punjabi to English Bidirectional NMT System
Kamal Deep | Ajit Kumar | Vishal Goyal
Proceedings of the 17th International Conference on Natural Language Processing (ICON): System Demonstrations

Machine Translation is ongoing research for last few decades. Today, Corpus-based Machine Translation systems are very popular. Statistical Machine Translation and Neural Machine Translation are based on the parallel corpus. In this research, the Punjabi to English Bidirectional Neural Machine Translation system is developed. To improve the accuracy of the Neural Machine Translation system, Word Embedding and Byte Pair Encoding is used. The claimed BLEU score is 38.30 for Punjabi to English Neural Machine Translation system and 36.96 for English to Punjabi Neural Machine Translation system.

pdf bib
Punjabi to Urdu Machine Translation System
Nitin Bansal | Ajit Kumar
Proceedings of the 17th International Conference on Natural Language Processing (ICON): System Demonstrations

Development of Machine Translation System (MTS) for any language pair is a challenging task for several reasons. Lack of lexical resources for any language is one of the major issue arise while developing MTS using that language. For example, during the development of Punjabi to Urdu MTS, many issues were recognized while preparing lexical resources for both the language. Since there is no machine readable dictionary is available for Punjabi to Urdu which can be directly used for translation; however various dictionaries are available to explain the meaning of word. Along with this, handling of OOV (out of vocabulary words), handling of multiple sense Punjabi word in Urdu, identification of proper nouns, identification of collocations in the source sentence i.e. Punjabi sentence in our case, are the issues which we are facing during development of this system. Since MTSs are in great demand from the last one decade and are being widely used in applications such as in case of smart phones. Therefore, development of such a system becomes more demanding and more users friendly. There usage is mainly in large scale translations, automated translations; act as an instrument to bridge a digital divide.