Somaiyeh Dehghan

2024

Evaluating ChatGPT’s Ability to Detect Hate Speech in Turkish Tweets
Somaiyeh Dehghan | Berrin Yanikoglu
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)

ChatGPT, developed by OpenAI, has made a significant impact on the world, particularly on how people interact with technology. In this study, we evaluate ChatGPT’s ability to detect hate speech in Turkish tweets, measuring its performance in zero- and few-shot settings and comparing the results with a BERT model trained with supervised fine-tuning. On evaluations with the SIU2023-NST dataset, ChatGPT achieved 65.81% accuracy in detecting hate speech in the few-shot setting, while BERT with supervised fine-tuning achieved 82.22% accuracy. This result supports previous findings that, despite its much smaller size, BERT is more suitable for natural language classification tasks such as hate speech detection.

Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024
Gökçe Uludoğan | Somaiyeh Dehghan | Inanc Arin | Elif Erol | Berrin Yanikoglu | Arzucan Özgür
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)

This paper offers an overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at the CASE workshop, to be held jointly with EACL 2024. The task was divided into two subtasks: Subtask A, targeting hate speech detection in various Turkish contexts, and Subtask B, addressing hate speech detection in Arabic with limited data. The shared task attracted significant attention, with 33 teams registering and 10 teams participating in at least one task. In this paper, we provide the details of the tasks and the approaches adopted by the participants, along with an analysis of the results obtained from this shared task.

A Concise Report of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text
Ali Hürriyetoğlu | Surendrabikram Thapa | Gökçe Uludoğan | Somaiyeh Dehghan | Hristo Tanev
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)

In this paper, we provide a brief overview of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE), co-located with EACL 2024. The workshop consisted of regular papers, system description papers submitted by shared task participants, and overview papers for the shared tasks held. The workshop series has been bringing together experts and enthusiasts from technical and social science fields, providing a platform for better understanding event information. It not only advances text-based event extraction but also facilitates research on event extraction in multimodal settings.

Multi-domain Hate Speech Detection Using Dual Contrastive Learning and Paralinguistic Features
Somaiyeh Dehghan | Berrin Yanıkoğlu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Social networks have become venues where people can share and spread hate speech, especially when the platforms allow users to remain anonymous. Hate speech can have significant social and cultural effects, especially when it targets specific groups of people on the basis of religion, race, ethnicity, culture, or a specific social situation, such as immigrants and refugees. In this study, we propose a hate speech detection model, BERTurk-DualCL, that uses a mixed objective combining a contrastive learning loss with the traditional cross-entropy loss used for classification. In addition, we study the effects of paralinguistic features, namely emojis and hashtags, on the performance of our model. We trained and evaluated our model on tweets covering four different topics with heated discussions from two separate datasets, ranging from discussions about migrants to the Israel-Palestine conflict. Our multi-domain model outperforms comparable results in the literature and the average results of four domain-specific models, achieving macro-F1 scores of 81.04% and 58.89% on the two- and five-class tasks, respectively.

Co-authors

Hristo Tanev 1

Surendrabikram Thapa 1

Arzucan Özgür 1
