Keying Li


2018

pdf bib
L1-L2 Parallel Treebank of Learner Chinese: Overused and Underused Syntactic Structures
Keying Li | John Lee
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib
Towards Universal Dependencies for Learner Chinese
John Lee | Herman Leung | Keying Li
Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017)

pdf bib
L1-L2 Parallel Dependency Treebank as Learner Corpus
John Lee | Keying Li | Herman Leung
Proceedings of the 15th International Conference on Parsing Technologies

This opinion paper proposes the use of parallel treebank as learner corpus. We show how an L1-L2 parallel treebank — i.e., parse trees of non-native sentences, aligned to the parse trees of their target hypotheses — can facilitate retrieval of sentences with specific learner errors. We argue for its benefits, in terms of corpus re-use and interoperability, over a conventional learner corpus annotated with error tags. As a proof of concept, we conduct a case study on word-order errors made by learners of Chinese as a foreign language. We report precision and recall in retrieving a range of word-order error categories from L1-L2 tree pairs annotated in the Universal Dependency framework.

pdf bib
Automatic Difficulty Assessment for Chinese Texts
John Lee | Meichun Liu | Chun Yin Lam | Tak On Lau | Bing Li | Keying Li
Proceedings of the IJCNLP 2017, System Demonstrations

We present a web-based interface that automatically assesses reading difficulty of Chinese texts. The system performs word segmentation, part-of-speech tagging and dependency parsing on the input text, and then determines the difficulty levels of the vocabulary items and grammatical constructions in the text. Furthermore, the system highlights the words and phrases that must be simplified or re-written in order to conform to the user-specified target difficulty level. Evaluation results show that the system accurately identifies the vocabulary level of 89.9% of the words, and detects grammar points at 0.79 precision and 0.83 recall.