CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction

Jakub Náplava, Milan Straka


Abstract
Our submitted models are NMT systems based on the Transformer model, which we improve by incorporating several enhancements: applying dropout to whole source and target words, weighting target subwords, averaging model checkpoints, and using the trained model iteratively for correcting the intermediate translations. The system in the Restricted Track is trained on the provided corpora with oversampled “cleaner” sentences and reaches 59.39 F0.5 score on the test set. The system in the Low-Resource Track is trained from Wikipedia revision histories and reaches 44.13 F0.5 score. Finally, we finetune the system from the Low-Resource Track on restricted data and achieve 64.55 F0.5 score.
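For readers unfamiliar with the metric reported above: F0.5 is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. The sketch below only illustrates that formula. The function name `f_beta` and the example values are illustrative assumptions, not taken from the paper or from the shared task's scoring tool.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall (standard F-beta).

    With beta = 0.5, precision counts more than recall.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only (not results from the paper):
# a precision-heavy system scores higher than a recall-heavy one.
high_precision = f_beta(precision=1.0, recall=0.5)
high_recall = f_beta(precision=0.5, recall=1.0)
print(high_precision > high_recall)
```

Swapping precision and recall changes the score, which is why a correction system that makes fewer but more reliable edits is favoured under this metric.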
Anthology ID:
W19-4419
Volume:
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
August
Year:
2019
Address:
Florence, Italy
Venues:
ACL | BEA | WS
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
183–190
URL:
https://aclanthology.org/W19-4419
DOI:
10.18653/v1/W19-4419
Cite (ACL):
Jakub Náplava and Milan Straka. 2019. CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 183–190, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction (Náplava & Straka, 2019)
PDF:
https://aclanthology.org/W19-4419.pdf
Data:
FCE