Khoula Al-Kharusi


2023

pdf bib
Machine Translation of Omani Arabic Dialect from Social Media
Khoula Al-Kharusi | Abdurahman AAlAbdulsalam
Proceedings of ArabicNLP 2023

Research studies on Machine Translation (MT) between Modern Standard Arabic (MSA) and English are abundant. However, studies on MT between Omani Arabic (OA) dialects and English are very scarce. This research study focuses on the lack of availability of an Omani dialect parallel dataset, as well as MT of OA to English. The study uses social media data from X (formerly Twitter) to build an authentic parallel text of the Omani dialects. The research presents baseline results on this dataset using Google Translate, Microsoft Translation, and Marian NMT. A taxonomy of the most common linguistic errors is used to analyze the translations made by the NMT systems to provide insights on future improvements. Finally, transfer learning is used to adapt Marian NMT to the Omani dialect, which significantly improved by 9.88 points in the BLEU score.