On Assessing and Developing Spoken ‘Grammatical Error Correction’ Systems

Yiting Lu, Stefano Bannò, Mark Gales


Abstract
Spoken ‘grammatical error correction’ (SGEC) is an important process for providing feedback in second language learning. Due to a lack of end-to-end training data, SGEC is often implemented as a cascaded, modular system consisting of speech recognition, disfluency removal, and grammatical error correction (GEC). This cascaded structure enables efficient use of training data for each module. It is, however, difficult to compare and evaluate the performance of individual modules, as preceding modules may introduce errors. For example, the GEC module input depends on the output of non-native speech recognition and disfluency detection, both challenging tasks for learner data. This paper focuses on the assessment and development of SGEC systems. We first discuss metrics for evaluating SGEC, both for individual modules and for the overall system. The system-level metrics enable tuning for optimal system performance. A known issue in cascaded systems is error propagation between modules. To mitigate this problem, semi-supervised approaches and self-distillation are investigated. Lastly, when an SGEC system is deployed, it is important to give accurate feedback to users. Thus, we apply filtering to remove low-confidence edits, aiming to improve overall feedback precision. The performance metrics are examined on a Linguaskill multi-level data set, which includes the original non-native speech, manual transcriptions, and reference grammatical error corrections, to enable system analysis and development.
Anthology ID:
2022.bea-1.9
Volume:
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Month:
July
Year:
2022
Address:
Seattle, Washington
Venues:
BEA | NAACL
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
51–60
URL:
https://aclanthology.org/2022.bea-1.9
DOI:
10.18653/v1/2022.bea-1.9
Cite (ACL):
Yiting Lu, Stefano Bannò, and Mark Gales. 2022. On Assessing and Developing Spoken ‘Grammatical Error Correction’ Systems. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), pages 51–60, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
On Assessing and Developing Spoken ‘Grammatical Error Correction’ Systems (Lu et al., BEA 2022)
PDF:
https://aclanthology.org/2022.bea-1.9.pdf