Ayman Alhelbawy
2020
The QMUL/HRBDT contribution to the NADI Arabic Dialect Identification Shared Task
Abdulrahman Aloraini | Massimo Poesio | Ayman Alhelbawy
Proceedings of the Fifth Arabic Natural Language Processing Workshop
We present the Arabic dialect identification system that we used for the country-level subtask of the NADI challenge. Our model consists of three components: BiLSTM-CNN, character-level TF-IDF, and topic modeling features. We represent each tweet using these features and feed them into a deep neural network. We then add an effective heuristic that improves the overall performance. We achieved an F1-Macro score of 20.77% and an accuracy of 34.32% on the test set. The model was also evaluated on the Arabic Online Commentary dataset, achieving results better than the state of the art.
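As a rough illustration of the character-level TF-IDF component named in the abstract above, the following is a minimal sketch in plain Python. The n-gram range, the smoothed IDF formula, the function names `char_ngrams` and `char_tfidf`, and the example tweets are all assumptions made for illustration; the abstract does not specify how the authors computed these features.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of length n_min..n_max in text."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def char_tfidf(docs, n_min=1, n_max=3):
    """Map each document to a dict of {n-gram: tf-idf weight}."""
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for c in counts for g in c)
    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            # Relative term frequency times a smoothed IDF (an assumed choice),
            # so n-grams found in every document keep a small positive weight.
            g: (k / total) * (math.log((1 + n_docs) / (1 + df[g])) + 1.0)
            for g, k in c.items()
        })
    return vectors


# Made-up example tweets, not taken from the NADI data.
tweets = ["شلونك اليوم", "ازيك النهارده", "كيفك اليوم"]
vectors = char_tfidf(tweets, n_min=2, n_max=3)
```

Each resulting sparse vector could then be concatenated with other per-tweet features before classification; how the original system combines them is not detailed here.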
2016
Towards a Corpus of Violence Acts in Arabic Social Media
Ayman Alhelbawy | Udo Kruschwitz | Massimo Poesio
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
In this paper, we present a new corpus of Arabic tweets that mention some form of violent event, developed to support the automatic identification of human rights abuse. The dataset was manually labelled for seven classes of violence using crowdsourcing.