Vinay P. Namboodiri

Also published as: Vinay P. Namboodiri


2020

A Multilingual Parallel Corpora Collection Effort for Indian Languages
Shashank Siripragada | Jerin Philip | Vinay P. Namboodiri | C V Jawahar
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present sentence-aligned parallel corpora across 10 Indian languages (Hindi, Telugu, Tamil, Malayalam, Gujarati, Urdu, Bengali, Oriya, Marathi, Punjabi) and English, many of which are categorized as low resource. The corpora are compiled from online sources which have content shared across languages. The corpora significantly extend existing resources, which are either not large enough or restricted to a specific domain (such as health). We also provide a separate test corpus, compiled from an independent online source, that can be used independently to validate performance in the 10 Indian languages. We also report on the methods used to construct such corpora, which rely on tools enabled by recent advances in machine translation and cross-lingual retrieval based on deep neural networks.

Exploring Pair-Wise NMT for Indian Languages
Kartheek Akella | Sai Himal Allu | Sridhar Suresh Ragupathi | Aman Singhal | Zeeshan Khan | C.V. Jawahar | Vinay P. Namboodiri
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

In this paper, we address the task of improving pair-wise machine translation for specific low resource Indian languages. Multilingual NMT models have demonstrated a reasonable amount of effectiveness on resource-poor languages. In this work, we show that the performance of these models can be significantly improved through a filtered back-translation process and subsequent fine-tuning on the limited pair-wise language corpora. The analysis in this paper suggests that this method can significantly improve multilingual models’ performance over their baselines, yielding state-of-the-art results for various Indian languages.

2018

CVIT-MT Systems for WAT-2018
Jerin Philip | Vinay P. Namboodiri | C.V. Jawahar
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation