Abhigyan Ghosh


2022

pdf bib
English To Indian Sign Language:Rule-Based Translation System Along With Multi-Word Expressions and Synonym Substitution
Abhigyan Ghosh | Radhika Mamidi
Proceedings of the 19th International Conference on Natural Language Processing (ICON)

The hearing challenged communities all over the world face difficulties to communicate with others. Machine translation has been one of the prominent technologies to facilitate communication with the deaf and hard of hearing community worldwide. We have explored and formulated the fundamental rules of Indian Sign Language(ISL) and implemented them as a translation mechanism of English Text to Indian Sign Language glosses. According to the formulated rules and sub-rules, the source text structure is identified and transferred to the target ISL gloss. This target language is such that it can be easily converted to videos using the Indian Sign Language dictionary. This research work also mentions the intermediate phases of the transfer process and innovations in the process such as Multi-Word Expression detection and synonym substitution to handle the limited vocabulary size of Indian Sign Language while producing semantically accurate translations.

2020

pdf bib
SUKHAN: Corpus of Hindi Shayaris annotated with Sentiment Polarity Information
Salil Aggarwal | Abhigyan Ghosh | Radhika Mamidi
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

Shayari is a form of poetry mainly popular in the Indian subcontinent, in which the poet expresses his emotions and feelings in a very poetic manner. It is one of the best ways to express our thoughts and opinions. Therefore, it is of prime importance to have an annotated corpus of Hindi shayaris for the task of sentiment analysis. In this paper, we introduce SUKHAN, a dataset consisting of Hindi shayaris along with sentiment polarity labels. To the best of our knowledge, this is the first corpus of Hindi shayaris annotated with sentiment polarity information. This corpus contains a total of 733 Hindi shayaris of various genres. Also, this dataset is of utmost value as all the annotation is done manually by five annotators and this makes it a very rich dataset for training purposes. This annotated corpus is also used to build baseline sentiment classification models using machine learning techniques.