Hrishikesh Terdalkar


2023

pdf bib
Antarlekhaka: A Comprehensive Tool for Multi-task Natural Language Annotation
Hrishikesh Terdalkar | Arnab Bhattacharya
Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)

One of the primary obstacles in the advancement of Natural Language Processing (NLP) technologies for low-resource languages is the lack of annotated datasets for training and testing machine learning models. In this paper, we present Antarlekhaka, a tool for manual annotation of a comprehensive set of tasks relevant to NLP. The tool is Unicode-compatible, language-agnostic, Web-deployable and supports distributed annotation by multiple simultaneous annotators. The system sports user-friendly interfaces for 8 categories of annotation tasks. These, in turn, enable the annotation of a considerably larger set of NLP tasks. The task categories include two linguistic tasks not handled by any other tool, namely, sentence boundary detection and deciding canonical word order, which are important tasks for text that is in the form of poetry. We propose the idea of sequential annotation based on small text units, where an annotator performs several tasks related to a single text unit before proceeding to the next unit. The research applications of the proposed mode of multi-task annotation are also discussed. Antarlekhaka outperforms other annotation tools in objective evaluation. It has been also used for two real-life annotation tasks on two different languages, namely, Sanskrit and Bengali. The tool is available at https://github.com/Antarlekhaka/code

pdf bib
Chandojnanam: A Sanskrit Meter Identification and Utilization System
Hrishikesh Terdalkar | Arnab Bhattacharya
Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference

pdf bib
Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text
Hrishikesh Terdalkar | Arnab Bhattacharya | Madhulika Dubey | S Ramamurthy | Bhavna Naneria Singh
Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference

2022

pdf bib
A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit
Jivnesh Sandhan | Ashish Gupta | Hrishikesh Terdalkar | Tushar Sandhan | Suvendu Samanta | Laxmidhar Behera | Pawan Goyal
Proceedings of the 29th International Conference on Computational Linguistics

The phenomenon of compounding is ubiquitous in Sanskrit. It serves for achieving brevity in expressing thoughts, while simultaneously enriching the lexical and structural formation of the language. In this work, we focus on the Sanskrit Compound Type Identification (SaCTI) task, where we consider the problem of identifying semantic relations between the components of a compound word. Earlier approaches solely rely on the lexical information obtained from the components and ignore the most crucial contextual and syntactic information useful for SaCTI. However, the SaCTI task is challenging primarily due to the implicitly encoded context-sensitive semantic relation between the compound components. Thus, we propose a novel multi-task learning architecture which incorporates the contextual information and enriches the complementary syntactic information using morphological tagging and dependency parsing as two auxiliary tasks. Experiments on the benchmark datasets for SaCTI show 6.1 points (Accuracy) and 7.7 points (F1-score) absolute gain compared to the state-of-the-art system. Further, our multi-lingual experiments demonstrate the efficacy of the proposed architecture in English and Marathi languages.

2019

pdf bib
Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs
Hrishikesh Terdalkar | Arnab Bhattacharya
Proceedings of the 6th International Sanskrit Computational Linguistics Symposium