An Extended Sequence Tagging Vocabulary for Grammatical Error Correction

Stuart Mesham, Christopher Bryant, Marek Rei, Zheng Yuan


Abstract
We extend a current sequence-tagging approach to Grammatical Error Correction (GEC) by introducing specialised tags for spelling correction and morphological inflection using the SymSpell and LemmInflect algorithms. Our approach improves generalisation: the proposed new tagset allows a smaller number of tags to correct a larger range of errors. Our results show a performance improvement both overall and in the targeted error categories. We further show that ensembles trained with our new tagset outperform those trained with the baseline tagset on the public BEA benchmark.
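The generalisation claim can be illustrated with a minimal sketch. It is not the authors' implementation: the tag names ($KEEP, $SPELL, $INFLECT_<POS>, $REPLACE_<word>), the helper functions, and the toy lookup tables are all assumptions made for illustration, and the tables only stand in for the role that SymSpell and LemmInflect play in the paper. The point is that a single inflection tag can correct many different words, whereas a word-specific replacement tag covers only one.

```python
from typing import Dict, List, Tuple

# Toy stand-ins (assumptions): tiny lookup tables that imitate the role of a
# spelling corrector and a morphological inflector. They are not the real
# SymSpell or LemmInflect APIs.
TOY_SPELLING: Dict[str, str] = {"recieve": "receive", "tommorow": "tomorrow"}
TOY_INFLECTIONS: Dict[Tuple[str, str], str] = {
    ("go", "VBD"): "went",
    ("eat", "VBD"): "ate",
    ("walk", "VBD"): "walked",
}


def apply_tag(token: str, tag: str) -> str:
    """Apply one (hypothetical) edit tag to one token."""
    if tag == "$KEEP":
        return token
    if tag == "$SPELL":
        # One tag delegates to the spelling corrector for any misspelt word.
        return TOY_SPELLING.get(token, token)
    if tag.startswith("$INFLECT_"):
        # One tag per target POS covers every word the inflector knows.
        # Simplification: the token is assumed to already be the lemma.
        pos = tag[len("$INFLECT_"):]
        return TOY_INFLECTIONS.get((token, pos), token)
    if tag.startswith("$REPLACE_"):
        # Word-specific tag: corrects to exactly one output word.
        return tag[len("$REPLACE_"):]
    raise ValueError(f"unknown tag: {tag}")


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one tag per token and return the corrected token list."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return [apply_tag(tok, tag) for tok, tag in zip(tokens, tags)]


if __name__ == "__main__":
    tokens = ["Yesterday", "I", "go", "home"]
    tags = ["$KEEP", "$KEEP", "$INFLECT_VBD", "$KEEP"]
    print(" ".join(apply_tags(tokens, tags)))
```

In this sketch, `$INFLECT_VBD` handles "go", "eat", and "walk" alike, while a tag such as `$REPLACE_went` would be needed separately for each target word.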
Anthology ID:
2023.findings-eacl.119
Volume:
Findings of the Association for Computational Linguistics: EACL 2023
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Andreas Vlachos, Isabelle Augenstein
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1608–1619
URL:
https://aclanthology.org/2023.findings-eacl.119
DOI:
10.18653/v1/2023.findings-eacl.119
Cite (ACL):
Stuart Mesham, Christopher Bryant, Marek Rei, and Zheng Yuan. 2023. An Extended Sequence Tagging Vocabulary for Grammatical Error Correction. In Findings of the Association for Computational Linguistics: EACL 2023, pages 1608–1619, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
An Extended Sequence Tagging Vocabulary for Grammatical Error Correction (Mesham et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-eacl.119.pdf
Software:
 2023.findings-eacl.119.software.zip
Video:
 https://aclanthology.org/2023.findings-eacl.119.mp4