Fine-grained Sexism Detection in Italian Newspapers

Federica Manzi, Leon Weber-Genzel, Barbara Plank


Abstract
In recent years, hate speech detection tasks have attracted growing interest in the field of Natural Language Processing. Two main trends stand out in the context of sexism recognition: a focus on overt forms of sexism, such as misogyny on social media, and the treatment of the problem as a text classification task. The main objective of this work is to introduce a new approach that treats sexism recognition as a sequence labelling task, operating at the token level rather than the document level. To achieve this goal, we introduce (i) the FGSDI (Fine-Grained Sexism Detection in Italian) corpus, which contains Italian newspaper articles annotated with fine-grained linguistic markers of sexism, and (ii) a two-step pipeline that first performs sexism detection at the sentence level and then sexism classification at the token level. Our primary findings are that (i) tackling sexism recognition as a sequence labelling task is possible, but requires a large amount of labelled data; (ii) few-shot learning is an effective solution for sexism detection when only a limited amount of data is available; (iii) the proposed pipeline outperforms the baseline, doubling the overall precision and achieving a better F1-score.
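The two-step pipeline described in the abstract can be pictured with a minimal sketch. This is only an illustration of the general idea (a sentence-level detector gating a token-level tagger), not the authors' implementation: the function names, the toy cue set, and the `B-MARKER`/`O` labels are all hypothetical stand-ins, and the real models and label inventory are defined in the paper.

```python
from typing import Callable, List, Tuple

Sentence = List[str]  # a tokenised sentence
SentenceDetector = Callable[[Sentence], bool]   # step 1: is the sentence flagged?
TokenTagger = Callable[[Sentence], List[str]]   # step 2: one label per token


def run_pipeline(
    sentences: List[Sentence],
    detect: SentenceDetector,
    tag: TokenTagger,
) -> List[List[Tuple[str, str]]]:
    """Run the token tagger only on sentences flagged by the detector.

    Sentences that are not flagged receive the outside label "O" on every
    token, so the tagger cannot produce predictions for them.
    """
    results = []
    for tokens in sentences:
        if detect(tokens):
            labels = tag(tokens)
            if len(labels) != len(tokens):
                raise ValueError("tagger must return one label per token")
        else:
            labels = ["O"] * len(tokens)
        results.append(list(zip(tokens, labels)))
    return results


# Hypothetical toy components, standing in for trained models.
TOY_CUES = {"CUE"}  # placeholder cue token, not taken from the corpus


def toy_detect(tokens: Sentence) -> bool:
    return any(token in TOY_CUES for token in tokens)


def toy_tag(tokens: Sentence) -> List[str]:
    return ["B-MARKER" if token in TOY_CUES else "O" for token in tokens]


if __name__ == "__main__":
    example = [["a", "CUE", "b"], ["c", "d"]]
    for tagged in run_pipeline(example, toy_detect, toy_tag):
        print(tagged)
```

In this sketch the detector acts as a filter: the tagger is never asked about sentences the detector rejects, which is one way a pipeline of this shape can trade some recall for precision.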
Anthology ID:
2024.clicit-1.66
Volume:
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Month:
December
Year:
2024
Address:
Pisa, Italy
Editors:
Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, Rachele Sprugnoli
Venue:
CLiC-it
Publisher:
CEUR Workshop Proceedings
Pages:
556–583
URL:
https://aclanthology.org/2024.clicit-1.66/
Cite (ACL):
Federica Manzi, Leon Weber-Genzel, and Barbara Plank. 2024. Fine-grained Sexism Detection in Italian Newspapers. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), pages 556–583, Pisa, Italy. CEUR Workshop Proceedings.
Cite (Informal):
Fine-grained Sexism Detection in Italian Newspapers (Manzi et al., CLiC-it 2024)
PDF:
https://aclanthology.org/2024.clicit-1.66.pdf