Saurabh Kumar Pandey


2025

CULTURALLY YOURS: A Reading Assistant for Cross-Cultural Content
Saurabh Kumar Pandey | Harshit Budhiraja | Sougata Saha | Monojit Choudhury
Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations

Users from diverse cultural backgrounds frequently face challenges in understanding content from online sources written by people from a different culture. This paper presents CULTURALLY YOURS (CY), a first-of-its-kind cultural reading assistant designed to identify culture-specific items (CSIs) for users from varying cultural contexts. By leveraging principles of relevance feedback and using culture as a prior, the tool personalizes to the user's preferences based on the user's interactions with it. CY can be powered by any LLM that can reason over the user's cultural background and the English input text, both provided as part of a prompt that is iteratively refined as the user continues interacting with the system. In this demo, we use GPT-4o as the back-end. We conduct a user study with 13 users from 8 different geographies. The results demonstrate CY's effectiveness in enhancing user engagement and personalization, alongside comprehension of cross-cultural content.
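
The paper itself does not include code; as a rough illustration only, the following minimal Python sketch shows what a relevance-feedback loop of this kind could look like, calling GPT-4o through the OpenAI chat API (the back-end named in the demo). Every function name, the prompt wording, and the feedback handling below are assumptions for illustration, not details from the paper.

```python
# Hypothetical sketch of CY-style CSI detection with relevance feedback.
# Names, prompt wording, and the UI stub are assumptions, not from the paper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def build_prompt(text: str, culture: str, feedback: list[str]) -> str:
    """Use the reader's culture as a prior and fold earlier feedback
    into the prompt, refining it with each interaction."""
    prompt = (
        f"The reader's cultural background is: {culture}.\n"
        "List, one per line, the culture-specific items (CSIs) in the "
        f"following English text that this reader may not know:\n\n{text}"
    )
    if feedback:
        prompt += (
            "\n\nThe reader marked these items as already familiar; "
            "do not flag similar items again:\n" + "\n".join(feedback)
        )
    return prompt


def detect_csis(text: str, culture: str, feedback: list[str]) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": build_prompt(text, culture, feedback)}],
    )
    reply = response.choices[0].message.content
    return [line.strip() for line in reply.splitlines() if line.strip()]


def user_marked_familiar(item: str) -> bool:
    """Stand-in for the tool's UI, where the user dismisses items they
    already know; here simulated with a fixed set."""
    return item in {"Diwali"}


# Relevance-feedback loop: feedback accumulates across passages and
# keeps refining the prompt on each interaction.
feedback: list[str] = []
for passage in ["During Diwali, households light diyas and share mithai."]:
    for csi in detect_csis(passage, culture="Brazilian", feedback=feedback):
        if user_marked_familiar(csi):
            feedback.append(csi)
```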

2024

Evaluating ChatGPT against Functionality Tests for Hate Speech Detection
Mithun Das | Saurabh Kumar Pandey | Animesh Mukherjee
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Large language models like ChatGPT have recently shown great promise in performing several tasks, including hate speech detection. However, it is crucial to understand the limitations of these models in order to build robust hate speech detection systems. To bridge this gap, our study evaluates the strengths and weaknesses of the ChatGPT model in detecting hate speech at a granular level across 11 languages. Our evaluation employs a series of functionality tests that reveal various intricate failures of the model which aggregate metrics like macro F1 or accuracy are unable to surface. In addition, we investigate the influence of complex emotions, such as the use of emojis in hate speech, on the performance of the ChatGPT model. Our analysis highlights the shortcomings of generative models in detecting certain types of hate speech and underscores the need for further research and improvements in the workings of these models.
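
The paper's actual test suite is not reproduced here; the following hypothetical Python sketch only illustrates the general idea of functionality tests, namely scoring a classifier per named behaviour rather than with one aggregate number. The test cases and the keyword-based classify() stub are invented for illustration; a real evaluation would replace the stub with a model call (e.g., prompting ChatGPT and parsing its answer).

```python
# Illustrative sketch of functionality tests for a hate speech classifier.
# Cases, functionality names, and classify() are invented, not from the paper.
from collections import defaultdict

# Each test pairs an input with a gold label under a named functionality,
# e.g. negation handling or emoji substitution in slurs.
TEST_CASES = [
    ("I hate all [GROUP].",              "hateful",     "direct_derogation"),
    ("I don't hate [GROUP] at all.",     "non-hateful", "negated_hate"),
    ("All [GROUP] are 🐀.",              "hateful",     "emoji_substitution"),
    ("[GROUP] people are welcome here.", "non-hateful", "positive_mention"),
]


def classify(text: str) -> str:
    """Naive keyword stub standing in for a real model call."""
    return "hateful" if "hate" in text.lower() else "non-hateful"


# Unlike a single macro F1 score, per-functionality accuracy shows
# exactly which behaviour the model gets wrong.
results = defaultdict(lambda: [0, 0])  # functionality -> [correct, total]
for text, gold, functionality in TEST_CASES:
    results[functionality][0] += classify(text) == gold
    results[functionality][1] += 1

for functionality, (correct, total) in results.items():
    print(f"{functionality}: {correct}/{total}")
```

Running this sketch, the keyword stub passes the easy cases but fails negated_hate and emoji_substitution, which is exactly the kind of fine-grained failure the abstract argues aggregate metrics cannot surface.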