Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory

Vrund Dobariya; Jatayu Baxi; Bhavika Gambhava; Brijesh Bhatt

Abstract

Grammatical Error Correction (GEC) is a fundamental task in Natural Language Processing that focuses on automatically detecting and correcting grammatical errors in text. In this paper, we present a novel approach to GEC for Gujarati, an Indian language spoken by over 55 million people worldwide. Our approach combines a large language model with non-parametric memory modules to address the low-resource challenge. We evaluate our system on human-annotated and synthetic datasets, and the results are promising for Gujarati. The proposed approach is generic enough to be adopted for other languages. Furthermore, we release a publicly available evaluation dataset for Gujarati GEC along with an adapted version of the ERRANT framework to enable error-type-wise evaluation in Gujarati.

Anthology ID: 2025.findings-ijcnlp.28
Volume: Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month: December
Year: 2025
Address: Mumbai, India
Editors: Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venue: Findings
Publisher: The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Pages: 473–485
URL: https://aclanthology.org/2025.findings-ijcnlp.28/
Cite (ACL): Vrund Dobariya, Jatayu Baxi, Bhavika Gambhava, and Brijesh Bhatt. 2025. Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 473–485, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal): Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory (Dobariya et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-ijcnlp.28.pdf
