Abdessamad Benlahbib


2022

High Tech team at SemEval-2022 Task 6: Intended Sarcasm Detection for Arabic texts
Hamza Alami | Abdessamad Benlahbib | Ahmed Alami
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper presents our methods for the iSarcasmEval shared task, which consists of three subtasks. We participated in subtask A and subtask C. The purpose of subtask A was to predict whether a text is sarcastic, while the aim of subtask C was to determine which text is sarcastic given a sarcastic text and its non-sarcastic rephrase. Both solutions use pre-trained BERT models. The proposed models are optimized on simple objectives and are easy to grasp. Despite their simplicity, our methods ranked 4th and 2nd in iSarcasmEval subtask A and subtask C, respectively, for Arabic texts.

LISACTeam at SemEval-2022 Task 6: A Transformer based Approach for Intended Sarcasm Detection in English Tweets
Abdessamad Benlahbib | Hamza Alami | Ahmed Alami
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

In this paper, we present our system and findings for SemEval-2022 Task 6 - iSarcasmEval: Intended Sarcasm Detection in English. The main objective of this task was to identify sarcastic tweets. The task was challenging mainly because of (1) the small training dataset, which contains only 3468 tweets, and (2) the imbalanced class distribution (25% sarcastic and 75% non-sarcastic). Our submitted model, which ranked eighth on Sub-Task A and fifth on Sub-Task C, is a Transformer-based approach built on the BERTweet model.

2021

LISAC FSDM USMBA at SemEval-2021 Task 5: Tackling Toxic Spans Detection Challenge with Supervised SpanBERT-based Model and Unsupervised LIME-based Model
Abdessamad Benlahbib | Ahmed Alami | Hamza Alami
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Toxic spans detection is an emerging challenge that aims to find the toxic spans within a toxic text. In this paper, we describe our two solutions to this task. The first solution follows a supervised approach and is based on the SpanBERT model, which is designed to better embed and predict spans of text. The second solution adopts an unsupervised approach and combines a linear support vector machine with Local Interpretable Model-Agnostic Explanations (LIME), a technique used to interpret the predictions of learning-based models. Our supervised model outperformed the unsupervised model and achieved an F-score of 67.84% (ranked 22/85) in Task 5 at SemEval-2021: Toxic Spans Detection.
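The abstract does not spell out how the F-score is computed. As a minimal sketch, assuming the score is an F1 over sets of character offsets for a single post (the function name `span_f1` and the handling of empty spans are illustrative assumptions, not taken from the paper), a span-level score could be computed like this:

```python
def span_f1(predicted, gold):
    """F1 between two collections of character offsets for one post.

    Assumption: a post with no gold toxic offsets scores 1.0 only when
    nothing is predicted, and 0.0 otherwise.
    """
    pred, true = set(predicted), set(gold)
    if not true:
        return 1.0 if not pred else 0.0
    if not pred:
        return 0.0
    overlap = len(pred & true)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(true)
    return 2 * precision * recall / (precision + recall)


# Predicted offsets 0-3 against gold offsets 2-5: two offsets overlap.
score = span_f1([0, 1, 2, 3], [2, 3, 4, 5])
```

Under this assumption, a system-level score would be the mean of `span_f1` over all test posts.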

2020

LISAC FSDM-USMBA Team at SemEval-2020 Task 12: Overcoming AraBERT’s pretrain-finetune discrepancy for Arabic offensive language identification
Hamza Alami | Said Ouatik El Alaoui | Abdessamad Benlahbib | Noureddine En-nahnahi
Proceedings of the Fourteenth Workshop on Semantic Evaluation

AraBERT is an Arabic version of the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) model, which has achieved good performance in a variety of Natural Language Processing (NLP) tasks. In this paper, we propose an effective AraBERT embeddings-based method for dealing with offensive Arabic language on Twitter. First, we pre-process tweets by handling emojis and including their Arabic meanings. To overcome the pretrain-finetune discrepancy, we substitute each detected emoji with the special token [MASK] in both the fine-tuning and inference phases. Then, we represent the tweet tokens by applying the AraBERT model. Finally, we feed the tweet representation into a sigmoid function to decide whether a tweet is offensive or not. The proposed method achieved the best results on the OffensEval 2020: Arabic task and reached a macro F1 score of 90.17%.
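The emoji substitution step can be illustrated with a short sketch. The Unicode ranges, the `mask_emojis` name, and the whitespace handling below are assumptions made for illustration; the abstract does not say how emojis were detected, so this is not the authors' implementation.

```python
import re

# Rough emoji detector (an assumption, not the authors' method).
# An optional variation selector after the emoji is absorbed into the match.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator (flag) letters
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]\uFE0F?"
)

MASK_TOKEN = "[MASK]"


def mask_emojis(text: str) -> str:
    """Replace each detected emoji with [MASK] and normalize whitespace."""
    masked = EMOJI_PATTERN.sub(f" {MASK_TOKEN} ", text)
    return " ".join(masked.split())


example = mask_emojis("hello 😀 world")
```

Applying the same function before both fine-tuning and inference keeps the two phases consistent, which is the point of the substitution described above.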