Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning

Asım Ersoy; Enes Altinisik; Kareem Mohamed Darwish; Husrev Taha Sencar

doi:10.18653/v1/2025.arabicnlp-main.28

Abstract

Tool calling is a critical capability that allows Large Language Models (LLMs) to interact with external systems, significantly expanding their utility. However, research and resources for tool calling are predominantly English-centric, leaving a gap in our understanding of how to enable this functionality for other languages, such as Arabic. This paper investigates three key research questions: (1) the necessity of in-language (Arabic) tool-calling data versus relying on cross-lingual transfer, (2) the effect of general-purpose instruction tuning on tool-calling performance, and (3) the value of fine-tuning on specific, high-priority tools. To address these questions, we conduct extensive experiments using base and post-trained variants of an open-weight Arabic LLM. To enable this study, we bridge the resource gap by translating and adapting two open-source tool-calling datasets into Arabic. Our findings provide crucial insights into the optimal strategies for developing robust tool-augmented agents for Arabic.

Anthology ID: 2025.arabicnlp-main.28
Volume: Proceedings of The Third Arabic Natural Language Processing Conference
Month: November
Year: 2025
Address: Suzhou, China
Editors: Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
Venue: ArabicNLP
SIG: SIGARAB
Publisher: Association for Computational Linguistics
Pages: 347–358
URL: https://aclanthology.org/2025.arabicnlp-main.28/
DOI: 10.18653/v1/2025.arabicnlp-main.28
Cite (ACL): Asım Ersoy, Enes Altinisik, Kareem Mohamed Darwish, and Husrev Taha Sencar. 2025. Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning. In Proceedings of The Third Arabic Natural Language Processing Conference, pages 347–358, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning (Ersoy et al., ArabicNLP 2025)
PDF: https://aclanthology.org/2025.arabicnlp-main.28.pdf
