Apurba Paul


2023

Mytho-Annotator: An Annotation tool for Indian Hindu Mythology
Apurba Paul | Anupam Mondal | Sainik Mahata | Srijan Seal | Prasun Sarkar | Dipankar Das
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

Mythology is a collection of myths, especially one belonging to a particular religious or cultural tradition. We observed that an annotation tool is essential for identifying important and complex information in mythological texts or corpora. Additionally, obtaining high-quality annotated corpora for complex information extraction, including labeled text segments, is an expensive and time-consuming process. Hence, in this paper, we have designed and deployed an annotation tool for Hindu mythology, presented as Mytho-Annotator. This easy-to-use, web-based text annotation tool is powered by Natural Language Processing (NLP). The tool primarily labels three categories: named entities, relationships, and event entities. It offers a comprehensive and adaptable annotation paradigm.

2017

Identification of Character Adjectives from Mahabharata
Apurba Paul | Dipankar Das
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

The present paper describes the identification of prominent characters and their adjectives from the Indian mythological epic Mahabharata, using English texts. In contrast to the traditional approaches to named entity identification, the present system extracts hidden attributes associated with each of the characters (e.g., character adjectives). We observed distinct phrase-level linguistic patterns that hint at the presence of characters in different text spans. Six such patterns were used to extract the characters. In addition, a distinguishing set of novel features (e.g., multi-word expressions, nodes and paths of the parse tree, immediate ancestors, etc.) was employed. Further, the correlation of the features was measured in order to identify the important ones. Finally, we applied various machine learning algorithms (e.g., Naive Bayes, KNN, Logistic Regression, Decision Tree, Random Forest, etc.) along with deep learning to classify the patterns as characters or non-characters in order to achieve decent accuracy. Evaluation shows that the phrase-level linguistic patterns as well as the adopted features are highly effective in capturing characters and their adjectives.

A Deep Dive into Identification of Characters from Mahabharata
Apurba Paul | Dipankar Das
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

2015

Identification and Classification of Emotional Key Phrases from Psychological Texts
Apurba Paul | Dipankar Das
Proceedings of the ACL 2015 Workshop on Novel Computational Approaches to Keyphrase Extraction