Atanu Mandal


2023

IACS-LRILT: Machine Translation for Low-Resource Indic Languages
Dhairya Suman | Atanu Mandal | Santanu Pal | Sudip Naskar
Proceedings of the Eighth Conference on Machine Translation

Even though machine translation has seen huge improvements in the last decade, translation quality for Indic languages is still underwhelming, which is attributed to the small amount of parallel data available. In this paper, we present our approach to mitigating the scarcity of parallel training data for Indic languages, especially for the English-Manipuri and Assamese-English language pairs. Our primary submission for the Manipuri-to-English translation task provided the best scoring system for this language direction. We describe the systems we built in detail, along with our findings in the process.

2021

JUNLP@DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
Avishek Garain | Atanu Mandal | Sudip Kumar Naskar
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages

Offensive language identification has been an active area of research in natural language processing. With the proliferation of social media platforms, offensive language identification has become a need of the hour. Traditional offensive language identification models fail to deliver acceptable results because social media content is largely multilingual and code-mixed in nature. This paper addresses this problem by using IndicBERT and BERT architectures to identify offensive language in Kannada-English, Malayalam-English, and Tamil-English code-mixed text extracted from social media. When evaluated on the test corpus, the presented approach achieved precision, recall, and F1 scores of 0.62, 0.71, and 0.66 for Kannada-English; 0.77, 0.43, and 0.53 for Malayalam-English; and 0.71, 0.74, and 0.72 for Tamil-English, respectively.
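
A minimal sketch of how a BERT-style encoder such as IndicBERT might be fine-tuned for this kind of binary offensive/not-offensive classification, assuming the Hugging Face transformers library and the ai4bharat/indic-bert checkpoint; the toy data, label scheme, and hyperparameters are illustrative assumptions, not the authors' reported setup.

```python
# Sketch: fine-tune a pretrained multilingual encoder as a sequence classifier
# on code-mixed social media comments (labels and hyperparameters assumed).
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "ai4bharat/indic-bert"  # assumption; mBERT would be a BERT-architecture alternative
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Hypothetical toy examples: (code-mixed comment, label), 1 = offensive.
train_pairs = [
    ("super movie bro vera level", 0),
    ("placeholder offensive comment", 1),
]

def collate(batch):
    # Tokenize a batch of raw strings and attach integer labels.
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    max_length=128, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(train_pairs, batch_size=16, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for batch in loader:
        out = model(**batch)   # cross-entropy loss is computed internally from labels
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

Swapping the checkpoint for bert-base-multilingual-cased would give a plain BERT variant of the same setup; precision, recall, and F1 on a held-out test set would then be computed from the classifier's predictions.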