A Metric Learning Approach to Misogyny Categorization

Juan Manuel Coria, Sahar Ghannay, Sophie Rosset, Hervé Bredin


Abstract
The task of automatic misogyny identification and categorization has not received as much attention as other natural language tasks have, even though it is crucial for identifying hate speech in social Internet interactions. In this work, we address this sentence classification task from a representation learning perspective, using both a bidirectional LSTM and BERT optimized with the following metric learning loss functions: contrastive loss, triplet loss, center loss, congenerous cosine loss and additive angular margin loss. We set a new state of the art for the task with our fine-tuned BERT, whose sentence embeddings can be compared with a simple cosine distance, and we release all our code as open source for easy reproducibility. Moreover, we find that almost every loss function performs equally well in this setting, matching the regular cross entropy loss.
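As a rough illustration of two ideas named in the abstract, the sketch below computes a cosine distance between embeddings and a triplet loss built on that distance. It is not the authors' released code: the function names, the margin value of 0.5, and the two-dimensional toy vectors are all assumptions made for the example, and real sentence embeddings would come from a trained encoder such as the fine-tuned BERT.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss over cosine distance.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is. The margin of 0.5 is an assumed value
    for illustration, not one taken from the paper.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical toy embeddings: the anchor and positive share a class.
anchor = [1.0, 0.0]
positive = [1.0, 0.0]
negative = [0.0, 1.0]

well_separated = triplet_loss(anchor, positive, negative)
badly_separated = triplet_loss(anchor, negative, positive)
```

In this toy case the first triplet is already separated by more than the margin, so its loss vanishes, while swapping the positive and negative roles produces a positive loss that training would push down.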
Anthology ID:
2020.repl4nlp-1.12
Volume:
Proceedings of the 5th Workshop on Representation Learning for NLP
Month:
July
Year:
2020
Address:
Online
Editors:
Spandana Gella, Johannes Welbl, Marek Rei, Fabio Petroni, Patrick Lewis, Emma Strubell, Minjoon Seo, Hannaneh Hajishirzi
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
89–94
URL:
https://aclanthology.org/2020.repl4nlp-1.12
DOI:
10.18653/v1/2020.repl4nlp-1.12
Cite (ACL):
Juan Manuel Coria, Sahar Ghannay, Sophie Rosset, and Hervé Bredin. 2020. A Metric Learning Approach to Misogyny Categorization. In Proceedings of the 5th Workshop on Representation Learning for NLP, pages 89–94, Online. Association for Computational Linguistics.
Cite (Informal):
A Metric Learning Approach to Misogyny Categorization (Coria et al., RepL4NLP 2020)
PDF:
https://aclanthology.org/2020.repl4nlp-1.12.pdf
Video:
http://slideslive.com/38929778