ArMIS - The Arabic Misogyny and Sexism Corpus with Annotator Subjective Disagreements

Dina Almanea, Massimo Poesio


Abstract
The use of misogynistic and sexist language has increased in recent years in social media, and is increasing in the Arabic world in reaction to reforms attempting to remove restrictions on women's lives. However, there are few benchmarks for Arabic misogyny and sexism detection, and in those the annotations are available only in aggregated form, even though misogyny and sexism judgments are found to be highly subjective. In this paper we introduce an Arabic misogyny and sexism dataset (ArMIS) characterized by providing annotations from annotators with different degrees of religious belief, and provide evidence that such differences do result in disagreements. To the best of our knowledge, this is the first dataset to study in detail the effect of beliefs on misogyny and sexism annotation. We also discuss proof-of-concept experiments showing that a dataset in which disagreements have not been reconciled can be used to train state-of-the-art models for misogyny and sexism detection, and consider different ways in which such models could be evaluated.
Anthology ID:
2022.lrec-1.244
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2282–2291
URL:
https://aclanthology.org/2022.lrec-1.244
Cite (ACL):
Dina Almanea and Massimo Poesio. 2022. ArMIS - The Arabic Misogyny and Sexism Corpus with Annotator Subjective Disagreements. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2282–2291, Marseille, France. European Language Resources Association.
Cite (Informal):
ArMIS - The Arabic Misogyny and Sexism Corpus with Annotator Subjective Disagreements (Almanea & Poesio, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.244.pdf