Guidelines for Fine-grained Sentence-level Arabic Readability Annotation

Nizar Habash; Hanada Taha-Thomure; Khalid N. Elmadani; Zeina Zeino; Abdallah Abushmaes

doi:10.18653/v1/2025.law-1.30

Abstract

This paper presents the annotation guidelines of the Balanced Arabic Readability Evaluation Corpus (BAREC), a large-scale resource for fine-grained sentence-level readability assessment in Arabic. BAREC includes 69,441 sentences (1M+ words) labeled across 19 levels, from kindergarten to postgraduate. Based on the Taha/Arabi21 framework, the guidelines were refined through iterative training with native Arabic-speaking educators. We highlight key linguistic, pedagogical, and cognitive factors in determining readability and report high inter-annotator agreement: Quadratic Weighted Kappa 81.8% (substantial/excellent agreement) in the last annotation phase. We also benchmark automatic readability models across multiple classification granularities (19-, 7-, 5-, and 3-level). The corpus and guidelines are publicly available: http://barec.camel-lab.com.
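For readers unfamiliar with the agreement metric named above, the following is a minimal sketch of Quadratic Weighted Kappa for two annotators on an ordinal scale. It is not the authors' evaluation code. The function name `quadratic_weighted_kappa`, the encoding of levels as integers 1 to `num_levels`, and the convention of returning 1.0 when the expected disagreement is zero are all assumptions made for illustration.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    labels_a: Sequence[int], labels_b: Sequence[int], num_levels: int = 19
) -> float:
    """Quadratic Weighted Kappa between two annotators.

    Assumes labels are integers in 1..num_levels (hypothetical encoding).
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")

    n = len(labels_a)
    k = num_levels

    # Observed confusion matrix: observed[i][j] counts items that
    # annotator A placed at level i+1 and annotator B at level j+1.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        if not (1 <= a <= k and 1 <= b <= k):
            raise ValueError("labels must lie in 1..num_levels")
        observed[a - 1][b - 1] += 1

    # Marginal label counts for each annotator.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: grows with the squared distance between levels.
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two annotators.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0.0:
        # Both annotators used one identical level throughout;
        # treated here as perfect agreement (an assumed convention).
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

Because the penalty is quadratic in the distance between levels, a disagreement of one level on a 19-level scale costs far less than a disagreement of ten levels, which is why the metric suits fine-grained ordinal labels.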

Anthology ID: 2025.law-1.30
Volume: Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Siyao Peng, Ines Rehbein
Venues: LAW | WS
Publisher: Association for Computational Linguistics
Pages: 359–376
URL: https://aclanthology.org/2025.law-1.30/
DOI: 10.18653/v1/2025.law-1.30
Cite (ACL): Nizar Habash, Hanada Taha-Thomure, Khalid N. Elmadani, Zeina Zeino, and Abdallah Abushmaes. 2025. Guidelines for Fine-grained Sentence-level Arabic Readability Annotation. In Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025), pages 359–376, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Guidelines for Fine-grained Sentence-level Arabic Readability Annotation (Habash et al., LAW 2025)
PDF: https://aclanthology.org/2025.law-1.30.pdf
