Gender and sentiment, critics and authors: a dataset of Norwegian book reviews

Samia Touileb, Lilja Øvrelid, Erik Velldal


Abstract
Gender bias in models and datasets is widely studied in NLP. The focus has usually been on analysing how females and males express themselves, or how females and males are described. However, a less studied aspect is the combination of these two perspectives: how females and males describe the same or the opposite gender. In this paper, we present a new gender-annotated sentiment dataset of critics reviewing the works of female and male authors. We investigate whether this newly annotated dataset contains differences in how the works of male and female authors are critiqued, in particular in terms of positive and negative sentiment. We also explore how these assessments differ between male and female critics. We show that there are differences in how critics assess the works of authors of the same or opposite gender. For example, male critics rate crime novels written by females, and romantic and sentimental works written by males, more negatively.
Anthology ID:
2020.gebnlp-1.11
Volume:
Proceedings of the Second Workshop on Gender Bias in Natural Language Processing
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster
Venue:
GeBNLP
Publisher:
Association for Computational Linguistics
Pages:
125–138
URL:
https://aclanthology.org/2020.gebnlp-1.11
Cite (ACL):
Samia Touileb, Lilja Øvrelid, and Erik Velldal. 2020. Gender and sentiment, critics and authors: a dataset of Norwegian book reviews. In Proceedings of the Second Workshop on Gender Bias in Natural Language Processing, pages 125–138, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Gender and sentiment, critics and authors: a dataset of Norwegian book reviews (Touileb et al., GeBNLP 2020)
PDF:
https://aclanthology.org/2020.gebnlp-1.11.pdf
Code
ltgoslo/norec_gender
Data
NoReC