Andrea Marra
2024
From Hate Speech to Societal Empowerment: A Pedagogical Journey Through Computational Thinking and NLP for High School Students
Alessandra Teresa Cignarella | Elisa Chierchiello | Chiara Ferrando | Simona Frenda | Soda Marem Lo | Andrea Marra
Proceedings of the Sixth Workshop on Teaching NLP
The teaching laboratory we have created integrates methodologies to address the topic of hate speech on social media with students while fostering computational thinking and AI education for societal impact. We provide a foundational understanding of hate speech and introduce computational concepts through matrices, bag-of-words representations, and practical exercises on platforms such as Colaboratory. Additionally, we emphasize the application of AI, particularly NLP, to real-world challenges. Through retrospective evaluation, we assess the efficacy of our approach, which aims to empower students as proactive contributors to societal betterment. In this paper, we present an overview of the laboratory’s structure, the primary materials used, and insights gleaned from the six editions conducted to date.
QUEEREOTYPES: A Multi-Source Italian Corpus of Stereotypes towards LGBTQIA+ Community Members
Alessandra Teresa Cignarella | Manuela Sanguinetti | Simona Frenda | Andrea Marra | Cristina Bosco | Valerio Basile
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The paper describes a dataset composed of two sub-corpora from two different sources in Italian. The QUEEREOTYPES corpus includes social media texts regarding LGBTQIA+ individuals, behaviors, ideology and events. The texts were collected from Facebook and Twitter in 2018 and were annotated for the presence of stereotypes and for orthogonal dimensions (such as hate speech, aggressiveness, offensiveness, and irony in one sub-corpus, and stance in the other). The resource was developed by Natural Language Processing researchers together with activists from an Italian LGBTQIA+ not-for-profit organization. The dataset allows the NLP community to study stereotypes against marginalized groups and individuals and, ultimately, to develop proper tools and measures to reduce the online spread of such stereotypes. The robustness of the language resource was tested by means of 5-fold cross-validation experiments. Finally, text classification experiments were carried out with a fine-tuned version of AlBERTo (a BERT-based model pre-trained on Italian tweets) and mBERT, obtaining good results on the task of stereotype detection and suggesting that stereotypes towards different targets might share common traits.