Varad Srivastava


2025

DweshVaani: An LLM for Detecting Religious Hate Speech in Code-Mixed Hindi-English
Varad Srivastava
Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)

Traditional language models have been used extensively for hate speech detection in NLP. With the growth of social media, content in regional languages has grown exponentially. However, the use of language models, including LLMs, for hate speech detection in code-mixed Hindi-English is under-explored. Our work addresses this gap by evaluating cutting-edge LLMs from Meta, Google, OpenAI, and Nvidia, as well as Indic LLMs such as Sarvam, Indic-Gemma, and Airavata, on hate speech detection in code-mixed Hindi-English. We cover a comprehensive set of few-shot scenarios, with examples selected either randomly or through retrieval-augmented generation (RAG) based on the MuRIL language model. We observe that Indic LLMs that are instruction-tuned on Indian content fall behind on the task. We also experiment with fine-tuning approaches, including knowledge-distillation-based fine-tuning that uses extracted rationales behind hate speech as part of the fine-tuning process. Finally, we propose DweshVaani, an LLM based on fine-tuned Gemma-2, which outperforms all other approaches at both religious hate speech detection and targeted religion identification in code-mixed Hindi-English.

2024

BAI-Arg LLM at the FinLLM Challenge Task: Earn While You Argue - Financial Argument Identification
Varad Srivastava
Proceedings of the Eighth Financial Technology and Natural Language Processing and the 1st Agent AI for Scenario Planning