AraNet: A Deep Learning Toolkit for Arabic Social Media

Muhammad Abdul-Mageed, Chiyu Zhang, Azadeh Hashemi, El Moatez Billah Nagoudi


Abstract
We describe AraNet, a collection of deep learning Arabic social media processing tools. Namely, we exploit an extensive host of both publicly available and novel social media datasets to train bidirectional encoders from transformers (BERT) focused at social meaning extraction. AraNet models predict age, dialect, gender, emotion, irony, and sentiment. AraNet either delivers state-of-the-art performance on a number of these tasks and performs competitively on others. AraNet is exclusively based on a deep learning framework, giving it the advantage of being feature-engineering free. To the best of our knowledge, AraNet is the first to performs predictions across such a wide range of tasks for Arabic NLP. As such, AraNet has the potential to meet critical needs. We publicly release AraNet to accelerate research, and to facilitate model-based comparisons across the different tasks
Anthology ID:
2020.osact-1.3
Volume:
Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Hend Al-Khalifa, Walid Magdy, Kareem Darwish, Tamer Elsayed, Hamdy Mubarak
Venue:
OSACT
SIG:
Publisher:
European Language Resource Association
Note:
Pages:
16–23
Language:
English
URL:
https://aclanthology.org/2020.osact-1.3
DOI:
Bibkey:
Cite (ACL):
Muhammad Abdul-Mageed, Chiyu Zhang, Azadeh Hashemi, and El Moatez Billah Nagoudi. 2020. AraNet: A Deep Learning Toolkit for Arabic Social Media. In Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, pages 16–23, Marseille, France. European Language Resource Association.
Cite (Informal):
AraNet: A Deep Learning Toolkit for Arabic Social Media (Abdul-Mageed et al., OSACT 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.osact-1.3.pdf
Code
 UBC-NLP/aranet