Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities

Mona Diab


Abstract
We have recently witnessed exponential growth in dialectal Arabic usage in both textual data and speech recordings, especially in social media. Processing such media is of great utility for applications ranging from information extraction to social media analytics for political and commercial purposes to building decision support systems. Compared to other languages, Arabic, especially the informal variety, poses a significant challenge to natural language processing algorithms, since it comprises multiple dialects, linguistic code switching, and a lack of standardized orthographies, on top of its relatively complex morphology. Inherently, the problem of processing Arabic in the context of social media is the problem of how to handle resource-poor languages. In this talk I will go over some of our insights into these problems and show that there is a silver lining: some of our solutions generalize to other low-resource language contexts.
Anthology ID:
W16-4805
Volume:
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
Venue:
VarDial
Publisher:
The COLING 2016 Organizing Committee
Pages:
42
URL:
https://aclanthology.org/W16-4805
Cite (ACL):
Mona Diab. 2016. Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), page 42, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities (Diab, VarDial 2016)
PDF:
https://aclanthology.org/W16-4805.pdf