Kaustubh Agarwal


2021

Humor Generation and Detection in Code-Mixed Hindi-English
Kaustubh Agarwal | Rhythm Narula
Proceedings of the Student Research Workshop Associated with RANLP 2021

Computational humor generation is one of the hardest tasks in natural language generation, especially in code-mixed languages. Existing research has shown that humor generation in English is a promising avenue. However, studies have shown that bilingual speakers often appreciate humor more in code-mixed languages, with their unexpected transitions and clever wordplay. In this study, we propose several methods for generating and detecting humor in code-mixed Hindi-English. Of the approaches tested, an attention-based bidirectional LSTM combined with converting parts of the text based on a word2vec embedding gives the best generation results, with 74.8% of the generated jokes being good. For detection, IndicBERT outperforms the other humor detection methods with an accuracy of 96.98%.

Deep Learning Based Approach For Detecting Suicidal Ideation in Hindi-English Code-Mixed Text: Baseline and Corpus
Kaustubh Agarwal | Bhavya Dhingra
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Suicide rates are rising among the youth, and their strong association with expressions of suicidal ideation on social media necessitates further research into models for detecting suicidal ideation in text, such as tweets, to enable mitigation. Existing research has proven the feasibility of detecting suicidal ideation on social media in a particular language. However, studies have shown that bilingual and multilingual speakers tend to use code-mixed text on social media, which makes detecting suicidal ideation in code-mixed data crucial, all the more so as the number of bilingual and multilingual speakers grows. In this study, we create a code-mixed Hindi-English (Hinglish) dataset for the detection of suicidal ideation and evaluate the performance of traditional classifiers, deep learning architectures, and transformers on it. Among the classifier architectures tested, IndicBERT gave the best results, with an accuracy of 98.54%.