Grammatical Error Annotation for Korean Learners of Spoken English

Hongsuck Seo, Kyusong Lee, Gary Geunbae Lee, Soo-Ok Kweon, Hae-Ri Kim


Abstract
The goal of our research is to build a grammatical error-tagged corpus for Korean learners of spoken English, dubbed the Postech Learner Corpus. We collected raw story-telling speech from Korean university students. Transcription and annotation using the Cambridge Learner Corpus tagset were performed by six Korean annotators fluent in English. For the annotation of the corpus, we developed an annotation tool and a validation tool. After comparing human annotation with machine-recommended error tags, unmatched errors were rechecked by a native annotator. We observed different characteristics between the spoken language corpus built in this study and an existing written language corpus.
Anthology ID:
L12-1035
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1628–1631
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/168_Paper.pdf
Cite (ACL):
Hongsuck Seo, Kyusong Lee, Gary Geunbae Lee, Soo-Ok Kweon, and Hae-Ri Kim. 2012. Grammatical Error Annotation for Korean Learners of Spoken English. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1628–1631, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Grammatical Error Annotation for Korean Learners of Spoken English (Seo et al., LREC 2012)