Kamel Gaanoun

2022

SI2M & AIOX Labs at WANLP 2022 Shared Task: Propaganda Detection in Arabic, A Data Augmentation and Name Entity Recognition Approach
Kamel Gaanoun | Imade Benelallam
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)

This paper presents the SI2M & AIOX Labs submission to the shared task on propaganda detection in Arabic text. The objective of this challenge is to identify the propaganda techniques used in specific propaganda fragments. Our system combines data augmentation, Named Entity Recognition, rule-based repetition detection, and ARBERT predictions. The model achieved a micro F1-score of 0.585 and ranked 6th out of 12 teams.

2021

Sarcasm and Sentiment Detection in Arabic language A Hybrid Approach Combining Embeddings and Rule-based Features
Kamel Gaanoun | Imade Benelallam
Proceedings of the Sixth Arabic Natural Language Processing Workshop

This paper presents the ArabicProcessors team’s system designed for the sarcasm (subtask 1) and sentiment (subtask 2) detection shared task. We created a hybrid system by combining rule-based features with both static and dynamic embeddings, using transformers and deep learning. The system’s architecture is an ensemble of Naive Bayes, MarBERT, and Mazajak embeddings. The system achieved an F1-score of 51% on sarcasm detection and 71% on sentiment detection.

2020

Arabic dialect identification: An Arabic-BERT model with data augmentation and ensembling strategy
Kamel Gaanoun | Imade Benelallam
Proceedings of the Fifth Arabic Natural Language Processing Workshop

This paper presents the ArabicProcessors team’s deep learning system designed for the NADI 2020 Subtask 1 (country-level dialect identification) and Subtask 2 (province-level dialect identification). We used Arabic-BERT in combination with data augmentation and ensembling methods. The unlabeled data provided by the task organizers (10 million tweets) was split into multiple subparts, to which we applied a semi-supervised learning method; we then ran a specific ensembling process on the resulting models. This system ranked 3rd in Subtask 1 with a 23.26% F1-score and 2nd in Subtask 2 with a 5.75% F1-score.

Co-authors

Imade Benelallam 3

Venues

WANLP 3
