Can LLMs Translate Cultural Nuance in Dialects? A Case Study on Lebanese Arabic

Silvana Yakhni; Ali Chehab

Abstract

Machine Translation (MT) of Arabic-script languages presents unique challenges due to their vast linguistic diversity and lack of standardization. This paper focuses on the Lebanese dialect and investigates how effectively Large Language Models (LLMs) handle culturally aware translation. We identify critical limitations in existing Lebanese-English parallel datasets, particularly their non-native nature and lack of cultural context. To address these gaps, we introduce a new culturally rich dataset derived from the Language Wave (LW) podcast. We evaluate LLMs (Jais, AceGPT, Cohere, and GPT-4 models) against Neural Machine Translation (NMT) systems (NLLB-200 and Google Translate). Our findings reveal that while both types of system perform similarly on non-native datasets, LLMs are better at preserving cultural nuances when handling authentic Lebanese content. Additionally, we validate xCOMET as a reliable metric for evaluating the quality of Arabic dialect translation, showing a strong correlation with human judgment. This work contributes to the growing field of Culturally-Aware Machine Translation and highlights the importance of authentic, culturally representative datasets in advancing low-resource translation systems.

Anthology ID: 2025.abjadnlp-1.13
Volume: Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editor: Mo El-Haj
Venues: AbjadNLP | WS
Publisher: Association for Computational Linguistics
Pages: 114–135
URL: https://aclanthology.org/2025.abjadnlp-1.13/
Cite (ACL): Silvana Yakhni and Ali Chehab. 2025. Can LLMs Translate Cultural Nuance in Dialects? A Case Study on Lebanese Arabic. In Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script, pages 114–135, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Can LLMs Translate Cultural Nuance in Dialects? A Case Study on Lebanese Arabic (Yakhni & Chehab, AbjadNLP 2025)
PDF: https://aclanthology.org/2025.abjadnlp-1.13.pdf
