The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction

Fraser Bowen, Jon Dehdari, Josef van Genabith


Abstract
In this research we investigate the impact of mismatches in the density and type of error between training and test data on a neural system correcting preposition and determiner errors. We use synthetically produced training data to control error density and type, and “real” error data for testing. Our results show it is possible to combine error types, although prepositions and determiners behave differently in terms of how much error should be artificially introduced into the training data in order to get the best results.
Anthology ID:
W17-4410
Volume:
Proceedings of the 3rd Workshop on Noisy User-generated Text
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Leon Derczynski, Wei Xu, Alan Ritter, Tim Baldwin
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
68–76
URL:
https://aclanthology.org/W17-4410
DOI:
10.18653/v1/W17-4410
Cite (ACL):
Fraser Bowen, Jon Dehdari, and Josef van Genabith. 2017. The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction. In Proceedings of the 3rd Workshop on Noisy User-generated Text, pages 68–76, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction (Bowen et al., WNUT 2017)
PDF:
https://aclanthology.org/W17-4410.pdf