CyberAgressionAdo-v1: a Dataset of Annotated Online Aggressions in French Collected through a Role-playing Game
Anaïs Ollagnier | Elena Cabrio | Serena Villata | Catherine Blaya
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Over the past decades, the number of episodes of cyber aggression occurring online has grown substantially, especially among teens. Most solutions investigated by the NLP community to curb such abusive online behavior are supervised approaches that rely on annotated data extracted from social media. However, recent studies have highlighted that private instant messaging platforms are major channels of cyber aggression among teens. Because such interactions remain invisible due to the apps' privacy policies, very few datasets of aggressive conversations are available for the computational analysis of language. To overcome this limitation, in this paper we present the CyberAgressionAdo-V1 dataset, which contains aggressive multiparty chats in French collected through a role-playing game in high schools and annotated at different layers. We describe the data collection and annotation phases, carried out in the context of an EU research project and a national research project, and analyze the different types of aggression and verbal abuse that emerge from the collected data, depending on the targeted victims (individuals or communities).
Impact of the nature and size of the training set on performance in the automatic detection of named entities (Impact de la nature et de la taille des corpus d’apprentissage sur les performances dans la détection automatique des entités nommées) [in French]
Anaïs Ollagnier | Sébastien Fournier | Patrice Bellot | Frédéric Béchet
Proceedings of TALN 2014 (Volume 2: Short Papers)