Carmel Lee Hah Heah


2017

NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students
Roger Vivek Placidus Winder | Joseph MacKinnon | Shu Yun Li | Benedict Christopher Tzer Liang Lin | Carmel Lee Hah Heah | Luís Morgado da Costa | Takayuki Kuribayashi | Francis Bond
Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)

This paper describes the creation of a new annotated learner corpus. The aim is to use this corpus to develop an automated system for corrective feedback on students’ writing. With this system, students will be able to receive timely feedback on language errors before they submit their assignments for grading. A corpus of assignments submitted by first-year engineering students was compiled, and a new error tag set for the NTU Corpus of Learner English (NTUCLE) was developed based on that of the NUS Corpus of Learner English (NUCLE), as well as marking rubrics used at NTU. After a description of the corpus, error tag set and annotation process, the paper presents the results of the annotation exercise as well as follow-up actions. The final error tag set, which is significantly larger than the NUCLE tag set, is then presented before a brief conclusion summarising our experience and future plans.