Mohammad Ali


2022

pdf bib
A Survey of Computational Framing Analysis Approaches
Mohammad Ali | Naeemul Hassan
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Framing analysis is predominantly qualitative and quantitative, examining a small dataset with manual coding. Easy access to digital data in the last two decades prompts scholars in both computation and social sciences to utilize various computational methods to explore frames in large-scale datasets. The growing scholarship, however, lacks a comprehensive understanding and resources of computational framing analysis methods. Aiming to address the gap, this article surveys existing computational framing analysis approaches and puts them together. The research is expected to help scholars and journalists gain a deeper understanding of how frames are being explored computationally, better equip them to analyze frames in large-scale datasets, and, finally, work on advancing methodological approaches.

2020

pdf bib
Multi-dialect Arabic BERT for Country-level Dialect Identification
Bashar Talafha | Mohammad Ali | Muhy Eddin Za’ter | Haitham Seelawi | Ibraheem Tuffaha | Mostafa Samir | Wael Farhan | Hussein Al-Natsheh
Proceedings of the Fifth Arabic Natural Language Processing Workshop

Arabic dialect identification is a complex problem for a number of inherent properties of the language itself. In this paper, we present the experiments conducted, and the models developed by our competing team, Mawdoo3 AI, along the way to achieving our winning solution to subtask 1 of the Nuanced Arabic Dialect Identification (NADI) shared task. The dialect identification subtask provides 21,000 country-level labeled tweets covering all 21 Arab countries. An unlabeled corpus of 10M tweets from the same domain is also presented by the competition organizers for optional use. Our winning solution itself came in the form of an ensemble of different training iterations of our pre-trained BERT model, which achieved a micro-averaged F1-score of 26.78% on the subtask at hand. We publicly release the pre-trained language model component of our winning solution under the name of Multi-dialect-Arabic-BERT model, for any interested researcher out there.