AMIYA Shared Task: Arabic Modeling In Your Accent at VarDial 2026

Nathaniel R. Robinson; Shahd Abdelmoneim; Anjali Kantharuban; Otba Alsboul; Salima Lamsiyah; Kelly Marchisio; Kenton Murray

Abstract

Arabic, often considered a single language, in fact comprises a wide range of varieties, some of them mutually unintelligible. While large language models (LLMs) have revolutionized natural language processing (NLP) with rapid advances, these models still best serve speakers of high-resource and standard language varieties. One notable deficiency is their handling of dialectal Arabic. We present the first-ever shared task for dialectal Arabic language modeling: Arabic Modeling In Your Accent, or AMIYA. The goal of the shared task was to develop LLMs that could (1) respond in the correct dialectal variety when explicitly or implicitly prompted to, (2) translate between dialectal Arabic and standard Arabic or English, (3) adhere to LLM instructions in dialectal Arabic, and (4) produce fluent Arabic outputs. We called for submissions in the dialectal varieties of five countries: Morocco, Egypt, Palestine, Syria, and Saudi Arabia. We received 45 submitted systems from six participating teams. We saw positive results from supervised fine-tuning on a translation objective and from reinforcement learning to improve dialectness. Manual evaluation also showed that some systems had learned to output dialectal words or phrases, but at the expense of actual fluency or coherence. Overall, the most effective system involved continual pre-training and supervised fine-tuning of 12 candidate LLMs, followed by selection of the best-performing models.

Anthology ID: 2026.vardial-1.1
Volume: Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects
Month: March
Year: 2026
Address: Rabat, Morocco
Venues: VarDial | WS
Publisher: Association for Computational Linguistics
Pages: 1–17
URL: https://aclanthology.org/2026.vardial-1.1/
Cite (ACL): Nathaniel R. Robinson, Shahd Abdelmoneim, Anjali Kantharuban, Otba Alsboul, Salima Lamsiyah, Kelly Marchisio, and Kenton Murray. 2026. AMIYA Shared Task: Arabic Modeling In Your Accent at VarDial 2026. In Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects, pages 1–17, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): AMIYA Shared Task: Arabic Modeling In Your Accent at VarDial 2026 (Robinson et al., VarDial 2026)
PDF: https://aclanthology.org/2026.vardial-1.1.pdf
