Using Gender- and Polarity-Informed Models to Investigate Bias

Samia Touileb, Lilja Øvrelid, Erik Velldal


Abstract
In this work we explore the effect of incorporating demographic metadata in a text classifier trained on top of a pre-trained transformer language model. More specifically, we add information about the gender of critics and book authors when classifying the polarity of book reviews, and the polarity of the reviews when classifying the genders of authors and critics. We use an existing data set of Norwegian book reviews with ratings by professional critics, which has also been augmented with gender information, and train a document-level sentiment classifier on top of a recently released Norwegian BERT-model. We show that gender-informed models obtain substantially higher accuracy, and that polarity-informed models obtain higher accuracy when classifying the genders of book authors. For this particular data set, we take this result as a confirmation of the gender bias in the underlying label distribution, but in other settings we believe a similar approach can be used for mitigating bias in the model.
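The paper's released code is not shown here, but one common way to make a classifier "gender-informed" or "polarity-informed" as the abstract describes is to prepend special metadata marker tokens to the review text before it is fed to the transformer. The sketch below illustrates that input-construction step under this assumption; the marker names (`[AUTH_F]`, `[CRIT_M]`, `[POL_POS]`, etc.) are hypothetical and would need to be registered in the tokenizer's vocabulary.

```python
# Hypothetical sketch (not the authors' code): condition a BERT-style
# classifier on demographic metadata by prepending marker tokens to the
# review text. The markers are assumed names, added for illustration.

def build_input(review_text, author_gender=None, critic_gender=None,
                polarity=None):
    """Prepend metadata markers to a review.

    author_gender / critic_gender: e.g. "f" or "m" (gender-informed
    polarity classification); polarity: e.g. "pos" or "neg"
    (polarity-informed gender classification). Any field left as None
    is simply omitted, so the same function covers both setups.
    """
    markers = []
    if author_gender is not None:
        markers.append(f"[AUTH_{author_gender.upper()}]")
    if critic_gender is not None:
        markers.append(f"[CRIT_{critic_gender.upper()}]")
    if polarity is not None:
        markers.append(f"[POL_{polarity.upper()}]")
    return " ".join(markers + [review_text])


# Gender-informed polarity classification input:
gender_informed = build_input("En fantastisk roman.",
                              author_gender="f", critic_gender="m")
# -> "[AUTH_F] [CRIT_M] En fantastisk roman."

# Polarity-informed gender classification input:
polarity_informed = build_input("En fantastisk roman.", polarity="pos")
# -> "[POL_POS] En fantastisk roman."
```

The resulting string would then be tokenized and passed to the pre-trained Norwegian BERT model with a standard classification head; because the markers appear in every input, the model can learn to attend to them alongside the review text.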
Anthology ID:
2021.gebnlp-1.8
Volume:
Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | GeBNLP | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
66–74
URL:
https://aclanthology.org/2021.gebnlp-1.8
DOI:
10.18653/v1/2021.gebnlp-1.8
PDF:
https://aclanthology.org/2021.gebnlp-1.8.pdf