Annotating Orthographic Target Hypotheses in a German L1 Learner Corpus

Ronja Laarmann-Quante; Katrin Ortmann; Anna Ehlert; Maurice Vogel; Stefanie Dipper

doi:10.18653/v1/W17-5051

Abstract

NLP applications for learners often rely on annotated learner corpora. It is therefore important that the annotations are meaningful for the task as well as consistent and reliable. We present a new longitudinal L1 learner corpus for German (handwritten texts collected in grades 2–4), which is transcribed and annotated with a target hypothesis that strictly corrects only orthographic errors and is thereby tailored to research and tool development on orthographic issues in primary school. While transcription and target hypothesis are not evaluated for most corpora, we conducted a detailed inter-annotator agreement study for both tasks. Although we achieved high agreement, our discussion of the cases of disagreement shows that, even with detailed guidelines, annotators occasionally differ, for a variety of reasons. This should also be considered when working with transcriptions and target hypotheses of other corpora, especially if no explicit guidelines for their construction are known.

Anthology ID: W17-5051
Volume: Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Month: September
Year: 2017
Address: Copenhagen, Denmark
Editors: Joel Tetreault, Jill Burstein, Claudia Leacock, Helen Yannakoudakis
Venue: BEA
SIG: SIGEDU
Publisher: Association for Computational Linguistics
Pages: 444–456
URL: https://aclanthology.org/W17-5051/
DOI: 10.18653/v1/W17-5051
Cite (ACL): Ronja Laarmann-Quante, Katrin Ortmann, Anna Ehlert, Maurice Vogel, and Stefanie Dipper. 2017. Annotating Orthographic Target Hypotheses in a German L1 Learner Corpus. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 444–456, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal): Annotating Orthographic Target Hypotheses in a German L1 Learner Corpus (Laarmann-Quante et al., BEA 2017)
PDF: https://aclanthology.org/W17-5051.pdf
