Ramaravind Kommiya Mothilal

2026

Argument-Based Consistency in Toxicity Explanations of LLMs
Ramaravind Kommiya Mothilal | Joanna Roy | Syed Ishtiaque Ahmed | Shion Guha
Findings of the Association for Computational Linguistics: EACL 2026

The discourse around toxicity and LLMs in NLP largely revolves around detection tasks. This work shifts the focus to evaluating LLMs’ *reasoning* about toxicity—from their explanations that justify a stance—to enhance their trustworthiness in downstream tasks. Despite extensive research on explainability, it is not straightforward to adopt existing methods to evaluate free-form toxicity explanation due to their over-reliance on input text perturbations, among other challenges. To account for these, we propose a novel, theoretically-grounded multi-dimensional criterion, **Argument-based Consistency (ArC)**, that measures the extent to which LLMs’ free-form toxicity explanations reflect an ideal and logical argumentation process. Based on uncertainty quantification, we develop six metrics for ArC to comprehensively evaluate the (in)consistencies in LLMs’ toxicity explanations. We conduct several experiments on three Llama models (of size up to 70B) and an 8B Ministral model on five diverse toxicity datasets. Our results show that while LLMs generate plausible explanations to simple prompts, their reasoning about toxicity breaks down when prompted about the nuanced relations between the complete set of reasons, the individual reasons, and their toxicity stances, resulting in inconsistent and irrelevant responses. We open-source our [code](https://github.com/uofthcdslab/ArC) and [LLM-generated explanations](https://huggingface.co/collections/uofthcdslab/arc) for future works.

2020

pdf bib abs

The primary obstacle to developing technologies for low-resource languages is the lack of usable data. In this paper, we report the adaption and deployment of 4 technology-driven methods of data collection for Gondi, a low-resource vulnerable language spoken by around 2.3 million tribal people in south and central India. In the process of data collection, we also help in its revival by expanding access to information in Gondi through the creation of linguistic resources that can be used by the community, such as a dictionary, children’s stories, an app with Gondi content from multiple sources and an Interactive Voice Response (IVR) based mass awareness platform. At the end of these interventions, we collected a little less than 12,000 translated words and/or sentences and identified more than 650 community members whose help can be solicited for future translation efforts. The larger goal of the project is collecting enough data in Gondi to build and deploy viable language technologies like machine translation and speech to text systems that can help take the language onto the internet.

Co-authors

Brij Mohan Lal Srivastava 1

Venkanna U 1

Venues

Findings1
LREC1

Fix author