Ranbir Sanasam


2023

pdf bib
IndiSocialFT: Multilingual Word Representation for Indian languages in code-mixed environment
Saurabh Kumar | Ranbir Sanasam | Sukumar Nandi
Findings of the Association for Computational Linguistics: EMNLP 2023

The increasing number of Indian language users on the internet necessitates the development of Indian language technologies. In response to this demand, our paper presents a generalized representation vector for diverse text characteristics, including native scripts, transliterated text, multilingual, code-mixed, and social media-related attributes. We gather text from both social media and well-formed sources and utilize the FastText model to create the “IndiSocialFT” embedding. Through intrinsic and extrinsic evaluation methods, we compare IndiSocialFT with three popular pretrained embeddings trained over Indian languages. Our findings show that the proposed embedding surpasses the baselines in most cases and languages, demonstrating its suitability for various NLP applications.

2022

pdf bib
TEAM: A multitask learning based Taxonomy Expansion approach for Attach and Merge
Bornali Phukon | Anasua Mitra | Ranbir Sanasam | Priyankoo Sarmah
Findings of the Association for Computational Linguistics: NAACL 2022

Taxonomy expansion is a crucial task. Most of Automatic expansion of taxonomy are of two types, attach and merge. In a taxonomy like WordNet, both merge and attach are integral parts of the expansion operations but majority of study consider them separately. This paper proposes a novel mult-task learning-based deep learning method known as Taxonomy Expansion with Attach and Merge (TEAM) that performs both the merge and attach operations. To the best of our knowledge this is the first study which integrates both merge and attach operations in a single model. The proposed models have been evaluated on three separate WordNet taxonomies, viz., Assamese, Bangla, and Hindi. From the various experimental setups, it is shown that TEAM outperforms its state-of-the-art counterparts for attach operation, and also provides highly encouraging performance for the merge operation.