Rucha Ambaliya


2025

Grammar correction for Indian languages is challenging because of complex morphology, non-standard spellings, and frequent script variation. In this work, we address grammar correction for English-mixed sentences in five Indic languages (Hindi, Bengali, Malayalam, Tamil, and Telugu) as part of the IndicGEC 2025 Bhasha Workshop. Our approach first applies word-level transliteration with IndicTrans (Bhat et al., 2014) to normalize Romanized and mixed-script tokens, and then performs grammar correction with the mT5-small model (Xue et al., 2021). Although our experiments cover only these five languages, the methodology can be generalized to other Indian languages. Our code is publicly available at: https://github.com/Rucha-Ambaliya/bhasha-workshop
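
The two-stage pipeline described above (word-level transliteration, then grammar correction) can be illustrated with a minimal sketch. This is not the released implementation: the function names, the ASCII-only test for Romanized tokens, and the toy lexicon are all assumptions made for illustration. In an actual system, the `transliterate` callable would wrap IndicTrans and the `correct` callable would wrap a fine-tuned mT5-small model; both are replaced here by stand-ins so the example runs on its own.

```python
import re
from typing import Callable

# Assumption: a token made only of ASCII letters is treated as Romanized.
# Real code-mixed text may need a language-ID step to keep genuine English words.
ROMAN_TOKEN = re.compile(r"^[A-Za-z]+$")


def normalize_script(sentence: str, transliterate: Callable[[str], str]) -> str:
    """Stage 1: transliterate each Roman-script token, leave other tokens as-is."""
    tokens = sentence.split()
    return " ".join(
        transliterate(tok) if ROMAN_TOKEN.match(tok) else tok for tok in tokens
    )


def correct_sentence(
    sentence: str,
    transliterate: Callable[[str], str],
    correct: Callable[[str], str],
) -> str:
    """Stage 2: run the grammar corrector on the script-normalized sentence."""
    return correct(normalize_script(sentence, transliterate))


# Hypothetical stand-ins, used only so that the sketch is runnable.
TOY_LEXICON = {"mujhe": "मुझे", "pasand": "पसंद"}


def toy_transliterate(token: str) -> str:
    # Placeholder for a word-level transliteration model such as IndicTrans.
    return TOY_LEXICON.get(token.lower(), token)


def toy_correct(sentence: str) -> str:
    # Placeholder for a sequence-to-sequence corrector; returns its input unchanged.
    return sentence


example = "mujhe चाय pasand है"
normalized = normalize_script(example, toy_transliterate)
corrected = correct_sentence(example, toy_transliterate, toy_correct)
print(corrected)
```

Passing the two stages as callables keeps the sketch independent of any particular model API, so either stage can be swapped without touching the other.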