Scott Thomas Andersen
HeteroCorpus: A Corpus for Heteronormative Language Detection
Juan Vásquez | Gemma Bel-Enguix | Scott Thomas Andersen | Sergio-Luis Ojeda-Trueba
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
In recent years, the NLP community has done extensive work on gender bias detection and mitigation in language systems. Yet, to our knowledge, no one has focused on the difficult task of heteronormative language detection and mitigation. We consider this an urgent issue, since language technologies are increasingly present in the world and, as various studies have shown, biased NLP systems can create real-life adverse consequences for women, gender minorities, racial minorities, and queer people. For these reasons, we propose and evaluate HeteroCorpus, a corpus created specifically for studying heteronormative language in English. Additionally, we propose a baseline set of classification experiments on our corpus, in order to show how the corpus performs in classification tasks.