Khadija ElFqih
2024
Large Language Models as Legal Translators of Arabic Legislatives: Does ChatGPT and Gemini Care for Context and Terminology?
Khadija ElFqih
|
Johanna Monti
Proceedings of The Second Arabic Natural Language Processing Conference
Accurate translation of terminology and adaptation to in-context information is a pillar to high quality translation. Recently, there is a remarkable interest towards the use and the evaluation of Large Language Models (LLMs) particularly for Machine Translation tasks. Nevertheless, despite their recent advancement and ability to understand and generate human-like language, these LLMs are still far from perfect, especially in domain-specific scenarios, and need to be thoroughly investigated. This is particularly evident in automatically translating legal terminology from Arabic into English and French, where, beyond the inherent complexities of legal language and specialised translations, technical limitations of LLMs further hinder accurate generation of text. In this paper, we present a preliminary evaluation of two evolving LLMs, namely GPT-4 Generative Pre-trained Transformer and Gemini, as legal translators of Arabic legislatives to test their accuracy and the extent to which they care for context and terminology across two language pairs (AR→EN / AR→FR). The study targets the evaluation of Zero-Shot prompting for in-context and out-of-context scenarios of both models relying on a gold standard dataset, verified by professional translators who are also experts in the field. We evaluate the results applying the Multidimensional Quality Metrics to classify translation errors. Moreover, we also evaluate the general LLMs outputs to verify their correctness, consistency, and completeness.In general, our results show that the models are far from perfect and recall for more fine-tuning efforts using specialised terminological data in the legal domain from Arabic into English and French.
Search