Nasim Sobhani
2024
Towards Fairer NLP Models: Handling Gender Bias In Classification Tasks
Nasim Sobhani
|
Sarah Delany
Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Measuring and mitigating gender bias in natural language processing (NLP) systems is crucial to ensure fair and ethical AI. However, a key challenge is the lack of explicit gender information in many textual datasets. This paper proposes two techniques, Identity Term Sampling (ITS) and Identity Term Pattern Extraction (ITPE), as alternatives to template-based approaches for measuring gender bias in text data. These approaches identify test data for measuring gender bias in the dataset itself and can be used to measure gender bias on any NLP classifier. We demonstrate the use of these approaches for measuring gender bias across various NLP classification tasks, including hate speech detection, fake news identification, and sentiment analysis. Additionally, we show how these techniques can benefit gender bias mitigation, proposing a variant of Counterfactual Data Augmentation (CDA), called Gender-Selective CDA (GS-CDA), which reduces the amount of data augmentation required in training data while effectively mitigating gender bias and maintaining overall classification performance.
2023
Measuring Gender Bias in Natural Language Processing: Incorporating Gender-Neutral Linguistic Forms for Non-Binary Gender Identities in Abusive Speech Detection
Nasim Sobhani
|
Kinshuk Sengupta
|
Sarah Jane Delany
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Predictions from machine learning models can reflect bias in the data on which they are trained. Gender bias has been shown to be prevalent in natural language processing models. The research into identifying and mitigating gender bias in these models predominantly considers gender as binary, male and female, neglecting the fluidity and continuity of gender as a variable. In this paper, we present an approach to evaluate gender bias in a prediction task, which recognises the non-binary nature of gender. We gender-neutralise a random subset of existing real-world hate speech data. We extend the existing template approach for measuring gender bias to include test examples that are gender-neutral. Measuring the bias across a selection of hate speech datasets we show that the bias for the gender-neutral data is closer to that seen for test instances that identify as male than those that identify as female.