Collecting fluency corrections for spoken learner English

Andrew Caines, Emma Flint, Paula Buttery


Abstract
We present crowdsourced collection of error annotations for transcriptions of spoken learner English. Our emphasis in data collection is on fluency corrections, a more complete correction than has traditionally been aimed for in grammatical error correction (GEC) research. Fluency corrections require improvements to the text, taking discourse and utterance level semantics into account: the result is a more naturalistic, holistic version of the original. We propose that this shifted emphasis be reflected in a new name for the task: ‘holistic error correction’ (HEC). We analyse crowdworker behaviour in HEC and conclude that the method is useful with certain amendments for future work.
Anthology ID:
W17-5010
Volume:
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Joel Tetreault, Jill Burstein, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
91–100
URL:
https://aclanthology.org/W17-5010
DOI:
10.18653/v1/W17-5010
Cite (ACL):
Andrew Caines, Emma Flint, and Paula Buttery. 2017. Collecting fluency corrections for spoken learner English. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 91–100, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Collecting fluency corrections for spoken learner English (Caines et al., BEA 2017)
PDF:
https://aclanthology.org/W17-5010.pdf
Data
CoNLL-2014 Shared Task: Grammatical Error Correction, FCE