JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams

Yaakov HaCohen-Kerner; Matan Fchima; Ilan Meyrowitsch

doi:10.18653/v1/2022.semeval-1.145

Abstract

In this paper, we describe our submissions to the SemEval-2022 contest. We tackled subtask 6-A, “iSarcasmEval: Intended Sarcasm Detection In English and Arabic – Binary Classification”. We developed separate models for the two languages, English and Arabic. We applied 4 supervised machine learning methods, 6 preprocessing methods for English and 3 for Arabic, and 3 oversampling methods. Our best submitted model for the English test dataset was an SVC model that balanced the dataset using SMOTE and removed stop words. For the Arabic test dataset, our best submitted model was an SVC model whose preprocessing removed elongation.
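The preprocessing and feature steps named in the title and abstract (stop-word removal, removing elongation, word n-grams) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the stop-word list, the elongation rule (dropping the Arabic tatweel character and collapsing runs of three or more identical characters), and all function names are assumptions made for illustration only. The classifier (SVC) and oversampling (SMOTE) stages are left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and"}

TATWEEL = "\u0640"  # Arabic elongation character (kashida)


def remove_elongation(text):
    """Assumed rule: drop tatweel, then collapse runs of 3+ identical characters to one."""
    text = text.replace(TATWEEL, "")
    return re.sub(r"(.)\1{2,}", r"\1", text)


def remove_stop_words(tokens):
    """Keep only tokens whose lowercase form is not in the stop-word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Usage: preprocess a tweet-like string, then extract word bigrams.
tokens = remove_stop_words(remove_elongation("This is sooooo great").split())
print(word_ngrams(tokens, 2))
```

In a full pipeline, n-gram counts like these would typically be turned into feature vectors before any oversampling or classification step.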

Anthology ID: 2022.semeval-1.145
Volume: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month: July
Year: 2022
Address: Seattle, United States
Editors: Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1031–1038
URL: https://aclanthology.org/2022.semeval-1.145/
DOI: 10.18653/v1/2022.semeval-1.145
Cite (ACL): Yaakov HaCohen-Kerner, Matan Fchima, and Ilan Meyrowitsch. 2022. JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1031–1038, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams (HaCohen-Kerner et al., SemEval 2022)
PDF: https://aclanthology.org/2022.semeval-1.145.pdf
