Koustav Rudra


2023

pdf bib
LLMs – the Good, the Bad or the Indispensable?: A Use Case on Legal Statute Prediction and Legal Judgment Prediction on Indian Court Cases
Shaurya Vats | Atharva Zope | Somsubhra De | Anurag Sharma | Upal Bhattacharya | Shubham Kumar Nigam | Shouvik Guha | Koustav Rudra | Kripabandhu Ghosh
Findings of the Association for Computational Linguistics: EMNLP 2023

The Large Language Models (LLMs) have impacted many real-life tasks. To examine the efficacy of LLMs in a high-stake domain like law, we have applied state-of-the-art LLMs for two popular tasks: Statute Prediction and Judgment Prediction, on Indian Supreme Court cases. We see that while LLMs exhibit excellent predictive performance in Statute Prediction, their performance dips in Judgment Prediction when compared with many standard models. The explanations generated by LLMs (along with prediction) are of moderate to decent quality. We also see evidence of gender and religious bias in the LLM-predicted results. In addition, we present a note from a senior legal expert on the ethical concerns of deploying LLMs in these critical legal tasks.

2016

pdf bib
Understanding Language Preference for Expression of Opinion and Sentiment: What do Hindi-English Speakers do on Twitter?
Koustav Rudra | Shruti Rijhwani | Rafiya Begum | Kalika Bali | Monojit Choudhury | Niloy Ganguly
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Functions of Code-Switching in Tweets: An Annotation Framework and Some Initial Experiments
Rafiya Begum | Kalika Bali | Monojit Choudhury | Koustav Rudra | Niloy Ganguly
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Code-Switching (CS) between two languages is extremely common in communities with societal multilingualism where speakers switch between two or more languages when interacting with each other. CS has been extensively studied in spoken language by linguists for several decades but with the popularity of social-media and less formal Computer Mediated Communication, we now see a big rise in the use of CS in the text form. This poses interesting challenges and a need for computational processing of such code-switched data. As with any Computational Linguistic analysis and Natural Language Processing tools and applications, we need annotated data for understanding, processing, and generation of code-switched language. In this study, we focus on CS between English and Hindi Tweets extracted from the Twitter stream of Hindi-English bilinguals. We present an annotation scheme for annotating the pragmatic functions of CS in Hindi-English (Hi-En) code-switched tweets based on a linguistic analysis and some initial experiments.