Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation

Bashar Alhafni, Go Inoue, Christian Khairallah, Nizar Habash


Abstract
Grammatical error correction (GEC) is a well-explored problem in English with many existing models and datasets. However, research on GEC in morphologically rich languages has been limited due to challenges such as data scarcity and language complexity. In this paper, we present the first results on Arabic GEC using two newly developed Transformer-based pretrained sequence-to-sequence models. We also define the task of multi-class Arabic grammatical error detection (GED) and present the first results on multi-class Arabic GED. We show that using GED information as auxiliary input in GEC models improves GEC performance across three datasets spanning different genres. We also investigate the use of contextual morphological preprocessing in aiding GEC systems. Our models achieve state-of-the-art (SOTA) results on two Arabic GEC shared task datasets and establish a strong benchmark on a recently created dataset. We make our code, data, and pretrained models publicly available.
Anthology ID:
2023.emnlp-main.396
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6430–6448
URL:
https://aclanthology.org/2023.emnlp-main.396
DOI:
10.18653/v1/2023.emnlp-main.396
Cite (ACL):
Bashar Alhafni, Go Inoue, Christian Khairallah, and Nizar Habash. 2023. Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 6430–6448, Singapore. Association for Computational Linguistics.
Cite (Informal):
Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation (Alhafni et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.396.pdf
Video:
https://aclanthology.org/2023.emnlp-main.396.mp4