Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario

Lena Keiper, Andrea Horbach, Stefan Thater


Abstract
We present a novel method to automatically improve the accuracy of part-of-speech taggers on learner language. The key idea underlying our approach is to exploit the structure of a typical language learner task and automatically induce POS information for out-of-vocabulary (OOV) words. To evaluate the effectiveness of our approach, we add manual POS and normalization information to an existing language learner corpus. Our evaluation shows an increase in accuracy from 72.4% to 81.5% on OOV words.
Anthology ID:
L16-1030
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
198–205
URL:
https://aclanthology.org/L16-1030
Cite (ACL):
Lena Keiper, Andrea Horbach, and Stefan Thater. 2016. Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 198–205, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario (Keiper et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1030.pdf