Jwala Dhamala


2022

Measuring Fairness of Text Classifiers via Prediction Sensitivity
Satyapriya Krishna | Rahul Gupta | Apurv Verma | Jwala Dhamala | Yada Pruksachatkun | Kai-Wei Chang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions. Although various fairness definitions have been explored in the recent literature, there is a lack of consensus on which metrics most accurately reflect the fairness of a system. In this work, we propose a new formulation, accumulated prediction sensitivity, which measures fairness in machine learning models based on the model’s prediction sensitivity to perturbations in input features. The metric attempts to quantify the extent to which a single prediction depends on a protected attribute, where the protected attribute encodes the membership status of an individual in a protected group. We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness. It also correlates well with humans’ perception of fairness. We conduct experiments on two text classification datasets, Jigsaw Toxicity and Bias in Bios, and evaluate the correlations between metrics and manual annotations on whether the model produced a fair outcome. We observe that the proposed fairness metric based on prediction sensitivity is statistically significantly more correlated with human annotation than the existing counterfactual fairness metric.
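
As a rough illustration of the idea of prediction sensitivity to a protected attribute, the sketch below uses a toy logistic classifier and finite differences. It is not the paper's accumulated prediction sensitivity formulation: the model, the function names (`predict`, `sensitivity`, `protected_share`), and the normalization are all assumptions made for this example.

```python
import math

def predict(weights, bias, features):
    """Toy logistic classifier: returns P(y = 1 | features)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def sensitivity(weights, bias, features, index, eps=1e-4):
    """Central finite-difference estimate of |d prediction / d features[index]|."""
    up = list(features)
    down = list(features)
    up[index] += eps
    down[index] -= eps
    return abs(predict(weights, bias, up) - predict(weights, bias, down)) / (2 * eps)

def protected_share(weights, bias, features, protected_index):
    """Fraction of the total input sensitivity attributable to the protected feature.

    Hypothetical normalization chosen for this sketch: 0.0 means the prediction
    ignores the protected feature locally; values near 1.0 mean it dominates.
    """
    sens = [sensitivity(weights, bias, features, i) for i in range(len(features))]
    total = sum(sens)
    return sens[protected_index] / total if total > 0 else 0.0

# Hypothetical model with three input features; the last one is the protected attribute.
share = protected_share([1.5, 0.5, 2.0], bias=0.1, features=[0.2, -0.4, 1.0], protected_index=2)
```

For a logistic model the gradient with respect to each feature is proportional to its weight, so the share here depends only on the relative weight magnitudes; a real text classifier would need gradients with respect to learned representations instead.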

On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations
Yang Cao | Yada Pruksachatkun | Kai-Wei Chang | Rahul Gupta | Varun Kumar | Jwala Dhamala | Aram Galstyan
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Multiple metrics have been introduced to measure fairness in various natural language processing tasks. These metrics can be roughly grouped into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream contextualized language representation models. In this paper, we conduct an extensive correlation study between intrinsic and extrinsic metrics across bias notions using 19 contextualized language models. We find that intrinsic and extrinsic metrics do not necessarily correlate in their original setting, even when correcting for metric misalignments, noise in evaluation datasets, and confounding factors such as experiment configuration for extrinsic metrics.

Proceedings of the 2nd Workshop on Trustworthy Natural Language Processing (TrustNLP 2022)
Apurv Verma | Yada Pruksachatkun | Kai-Wei Chang | Aram Galstyan | Jwala Dhamala | Yang Trista Cao
Proceedings of the 2nd Workshop on Trustworthy Natural Language Processing (TrustNLP 2022)

Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal
Umang Gupta | Jwala Dhamala | Varun Kumar | Apurv Verma | Yada Pruksachatkun | Satyapriya Krishna | Rahul Gupta | Kai-Wei Chang | Greg Ver Steeg | Aram Galstyan
Findings of the Association for Computational Linguistics: ACL 2022

Language models excel at generating coherent text, and model compression techniques such as knowledge distillation have enabled their use in resource-constrained settings. However, these models can be biased in multiple ways, including the unfounded association of male and female genders with gender-neutral professions. Therefore, knowledge distillation without any fairness constraints may transfer the teacher model’s biases to the distilled model, or even exaggerate them. To this end, we present a novel approach to mitigate gender disparity in text generation by learning a fair model during knowledge distillation. We propose two modifications to the base knowledge distillation based on counterfactual role reversal: modifying teacher probabilities and augmenting the training set. We evaluate gender polarity across professions in open-ended text generated from the resulting distilled and finetuned GPT-2 models and demonstrate a substantial reduction in gender disparity with only a minor compromise in utility. Finally, we observe that language models that reduce gender polarity in language generation do not improve embedding fairness or downstream classification fairness.
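
The training-set augmentation mentioned above can be pictured with a minimal sketch that swaps gendered words to create counterfactual sentences. The abstract does not give the actual procedure, so the word list, the function names (`role_reverse`, `augment`), and the swap-by-lookup approach are assumptions for illustration only; the teacher-probability modification is not shown.

```python
import re

# Hypothetical, deliberately tiny swap list. Ambiguous words such as "her"
# (which can map to "him" or "his") are left out to keep the sketch simple.
PAIRS = [("he", "she"), ("man", "woman"), ("father", "mother"), ("son", "daughter")]
SWAP = {}
for a, b in PAIRS:
    SWAP[a] = b
    SWAP[b] = a

_WORD = re.compile(r"[A-Za-z]+")

def role_reverse(text):
    """Return text with each listed gendered word replaced by its counterpart."""
    def repl(match):
        word = match.group(0)
        target = SWAP.get(word.lower())
        if target is None:
            return word
        # Preserve a leading capital letter (e.g. sentence-initial "He").
        return target.capitalize() if word[0].isupper() else target
    return _WORD.sub(repl, text)

def augment(corpus):
    """Keep every sentence and add its counterfactual when the swap changes it."""
    out = []
    for sentence in corpus:
        out.append(sentence)
        counterfactual = role_reverse(sentence)
        if counterfactual != sentence:
            out.append(counterfactual)
    return out

augmented = augment(["He is a doctor and she is a nurse.", "It rained."])
```

Sentences without any listed word pass through unchanged, so the augmented set only grows where a gendered term appears.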

2021

Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification
Yada Pruksachatkun | Satyapriya Krishna | Jwala Dhamala | Rahul Gupta | Kai-Wei Chang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Proceedings of the First Workshop on Trustworthy Natural Language Processing
Yada Pruksachatkun | Anil Ramakrishna | Kai-Wei Chang | Satyapriya Krishna | Jwala Dhamala | Tanaya Guha | Xiang Ren
Proceedings of the First Workshop on Trustworthy Natural Language Processing

2020

Evaluating the Effectiveness of Efficient Neural Architecture Search for Sentence-Pair Tasks
Ansel MacLaughlin | Jwala Dhamala | Anoop Kumar | Sriram Venkatapathy | Ragav Venkatesan | Rahul Gupta
Proceedings of the First Workshop on Insights from Negative Results in NLP

Neural Architecture Search (NAS) methods, which automatically learn entire neural models or individual neural cell architectures, have recently achieved competitive or state-of-the-art (SOTA) performance on a variety of natural language processing and computer vision tasks, including language modeling, natural language inference, and image classification. In this work, we explore the applicability of a SOTA NAS algorithm, Efficient Neural Architecture Search (ENAS) (Pham et al., 2018), to two sentence-pair tasks: paraphrase detection and semantic textual similarity. We use ENAS to perform a micro-level search and learn a task-optimized RNN cell architecture as a drop-in replacement for an LSTM. We explore the effectiveness of ENAS through experiments on three datasets (MRPC, SICK, STS-B), with two different models (ESIM, BiLSTM-Max), and two sets of embeddings (GloVe, BERT). In contrast to prior work applying ENAS to NLP tasks, our results are mixed: we find that ENAS architectures sometimes, but not always, outperform LSTMs and perform similarly to random architecture search.