Leila Moudjari

2025

Are Dialects Better Prompters? A Case Study on Arabic Subjective Text Classification
Leila Moudjari | Farah Benamara
Findings of the Association for Computational Linguistics: ACL 2025

This paper investigates the effect of dialectal prompting, variations in prompting script and model fine-tuning on subjective classification in Arabic dialects. To this end, we evaluate the performance of 12 widely used open LLMs across four tasks and eight benchmark datasets. Our results reveal that specialized fine-tuned models given dialectal prompts in Arabic and Arabizi scripts achieve the best results, establishing a new state of the art in the field.

DesCartes-HOPE at MAHED Shared task 2025: Integrating Pragmatic Features for Arabic Hope and Hate Speech Detection
Leila Moudjari | Hacène-Cherkaski Mélissa | Farah Benamara
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

2024

In emergency situations, users of social networks convey all sorts of communicative intentions, known since the work of Austin (1962) and Searle (1969) as speech acts (SA). While speech acts have been the focus of close scrutiny in the philosophical and linguistic literature (see Portner (2018) for extended discussion), their role has only rarely been understood and exploited in processing social media content about crisis events, our focus here. Current work on communicative intentions in social media is topic-oriented, focusing on the correlation between SA and specific topics such as crises (e.g., earthquakes) but also politics, celebrities, cooking, travel, etc. It has been observed that people globally tend to react to natural disasters with SA distinct from those used in other contexts (e.g., celebrities, which are essentially made up of comments). Here, we explore the further hypothesis of a correlation between different SA types and urgency, and propose an in-depth linguistic and computational analysis of communicative intentions in tweets from an urgency-oriented perspective. Indeed, SA are mostly relevant for identifying intentions, desires, plans and preferences towards action, and ultimately for producing a system intended to help rescue teams. Our contribution is four-fold: (1) a two-layer annotation scheme of speech acts at both the tweet and sub-tweet levels, (2) a new French dataset of about 13K tweets annotated for both urgency and SA, targeting both expected (e.g., storms) and unexpected or sudden (e.g., building collapse, explosion) events, (3) a thorough analysis of the annotations, studying in particular the correlation between SA and the urgency of the message, SA and intention-to-act categories (e.g., human damages), and SA and crisis types, and finally (4) a set of deep learning experiments to detect SA in crisis-related corpora. Our results show a strong correlation between SA and urgency annotations at both the tweet and sub-tweet levels, with a particularly salient correlation in the latter case, which constitutes a first important step towards SA-aware NLP-based crisis management on social media.

In this paper, we present our contribution to the speech emotion classification task as part of our participation in the Odyssey 2024 evaluation campaign. We propose a hybrid system that leverages both information from the audio signal and semantic information from automatic transcriptions. The results show that adding semantic information makes it possible to outperform audio-only systems.

2023

Classification de tweets en situation d’urgence pour la gestion de crises
Romain Meunier | Leila Moudjari | Farah Benamara | Véronique Moriceau | Alda Mari | Patricia Stolf
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : travaux de recherche originaux -- articles longs

Real-time processing of social media data has become an attractive tool in emergency situations, but information overload remains a challenge. In this paper, we present a new manually annotated French dataset for crisis management. We also test several machine learning models to classify tweets according to their relevance, urgency, and the intention they convey, in order to best assist emergency services during crises, using evaluation methods specific to crisis management. We also evaluate our models when they face new crises or even new types of crises, with encouraging results.

2020

An Algerian Corpus and an Annotation Platform for Opinion and Emotion Analysis
Leila Moudjari | Karima Akli-Astouati | Farah Benamara
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this paper, we address the lack of resources for opinion and emotion analysis related to North African dialects, targeting the Algerian dialect. We present TWIFIL (TWItter proFILing), a collaborative annotation platform for crowdsourcing the annotation of tweets at different levels of granularity. The platform allowed the creation of the largest Algerian dialect dataset annotated for sentiment (9,000 tweets), emotion (about 5,000 tweets) and extra-linguistic information including author profiling (age and gender). The annotation also resulted in the creation of the largest Algerian dialect subjectivity lexicon, with about 9,000 entries, which can constitute a valuable resource for the development of future NLP applications for the Algerian dialect. To test the validity of the dataset, a set of deep learning experiments was conducted to classify a given tweet as positive, negative or neutral. We discuss our results and provide an error analysis to better identify classification errors.
