@inproceedings{bahaj-ghogho-2026-mizanqa,
    title = {{M}izan{QA}: A Benchmark for Multi-Answer {M}oroccan Legal {QA}},
    author = {Bahaj, Adil and
      Ghogho, Mounir},
    editor = {Matusevych, Yevgen and
      Eryi{\u{g}}it, G{\"u}l{\c{s}}en and
      Aletras, Nikolaos},
    booktitle = {Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 5: Industry Track)},
    month = mar,
    year = {2026},
    address = {Rabat, Morocco},
    publisher = {Association for Computational Linguistics},
    url = {https://aclanthology.org/2026.eacl-industry.10/},
    pages = {132--144},
    isbn = {979-8-89176-384-5},
    abstract = {We present MizanQA, a benchmark for assessing LLMs on Moroccan legal MCQs, many with multiple correct answers. Covering 1,776 expert-verified questions in Modern Standard Arabic enriched with Moroccan idioms, the dataset reflects influences from Maliki jurisprudence, customary law, and French legal traditions. Unlike single-answer settings, MizanQA features variable option counts, creating added difficulty. We evaluate multilingual and Arabic-centric models in zero-shot, native-Arabic prompts, measuring accuracy, a precision-penalized F1-like score, and calibration errors. Results show large performance gaps and miscalibration, particularly under stricter penalties. By scoping this benchmark to parametric knowledge only, we provide a baseline for future retrieval-augmented and rationale-focused setups.},
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="bahaj-ghogho-2026-mizanqa">
<titleInfo>
<title>MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA</title>
</titleInfo>
<name type="personal">
<namePart type="given">Adil</namePart>
<namePart type="family">Bahaj</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mounir</namePart>
<namePart type="family">Ghogho</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2026-03</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yevgen</namePart>
<namePart type="family">Matusevych</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gülşen</namePart>
<namePart type="family">Eryiğit</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nikolaos</namePart>
<namePart type="family">Aletras</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Rabat, Morocco</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-384-5</identifier>
</relatedItem>
<abstract>We present MizanQA, a benchmark for assessing LLMs on Moroccan legal MCQs, many with multiple correct answers. Covering 1,776 expert-verified questions in Modern Standard Arabic enriched with Moroccan idioms, the dataset reflects influences from Maliki jurisprudence, customary law, and French legal traditions. Unlike single-answer settings, MizanQA features variable option counts, creating added difficulty. We evaluate multilingual and Arabic-centric models in zero-shot, native-Arabic prompts, measuring accuracy, a precision-penalized F1-like score, and calibration errors. Results show large performance gaps and miscalibration, particularly under stricter penalties. By scoping this benchmark to parametric knowledge only, we provide a baseline for future retrieval-augmented and rationale-focused setups.</abstract>
<identifier type="citekey">bahaj-ghogho-2026-mizanqa</identifier>
<location>
<url>https://aclanthology.org/2026.eacl-industry.10/</url>
</location>
<part>
<date>2026-03</date>
<extent unit="page">
<start>132</start>
<end>144</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA
%A Bahaj, Adil
%A Ghogho, Mounir
%Y Matusevych, Yevgen
%Y Eryiğit, Gülşen
%Y Aletras, Nikolaos
%S Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
%D 2026
%8 March
%I Association for Computational Linguistics
%C Rabat, Morocco
%@ 979-8-89176-384-5
%F bahaj-ghogho-2026-mizanqa
%X We present MizanQA, a benchmark for assessing LLMs on Moroccan legal MCQs, many with multiple correct answers. Covering 1,776 expert-verified questions in Modern Standard Arabic enriched with Moroccan idioms, the dataset reflects influences from Maliki jurisprudence, customary law, and French legal traditions. Unlike single-answer settings, MizanQA features variable option counts, creating added difficulty. We evaluate multilingual and Arabic-centric models in zero-shot, native-Arabic prompts, measuring accuracy, a precision-penalized F1-like score, and calibration errors. Results show large performance gaps and miscalibration, particularly under stricter penalties. By scoping this benchmark to parametric knowledge only, we provide a baseline for future retrieval-augmented and rationale-focused setups.
%U https://aclanthology.org/2026.eacl-industry.10/
%P 132-144
Markdown (Informal)
[MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA](https://aclanthology.org/2026.eacl-industry.10/) (Bahaj & Ghogho, EACL 2026)
ACL
- Adil Bahaj and Mounir Ghogho. 2026. MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 132–144, Rabat, Morocco. Association for Computational Linguistics.