Annotating Errors in a Hungarian Learner Corpus

Markus Dickinson, Scott Ledbetter


Abstract
We are developing and annotating a learner corpus of Hungarian, composed of student journals from three different proficiency levels written at Indiana University. Our annotation marks learner errors that are of different linguistic categories, including phonology, morphology, and syntax, but defining the annotation for an agglutinative language presents several issues. First, we must adapt an analysis that is centered on the morpheme rather than the word. Second, and more importantly, we see a need to distinguish errors from secondary corrections. We argue that although certain learner errors require a series of corrections to reach a target form, these secondary corrections, conditioned on those that come before, are our own adjustments that link the learner's productions to the target form and are not representative of the learner's internal grammar. In this paper, we report the annotation scheme and the principles that guide it, as well as examples illustrating its functionality and directions for expansion.
Anthology ID:
L12-1444
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1659–1664
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/758_Paper.pdf
Cite (ACL):
Markus Dickinson and Scott Ledbetter. 2012. Annotating Errors in a Hungarian Learner Corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1659–1664, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Annotating Errors in a Hungarian Learner Corpus (Dickinson & Ledbetter, LREC 2012)