Zilong Li
2026
Translation via Annotation: A Computational Study of Translating Classical Chinese into Japanese
Zilong Li | Jie Cao
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Ancient people translated classical Chinese into Japanese using a system of annotations placed around characters. We abstract this process as sequence tagging tasks and fit them into modern language technologies. Research on this annotation and translation system faces a low-resource problem. We alleviate this problem by introducing an LLM-based annotation pipeline and constructing a new dataset from digitized open-source translation data. We show that in the low-resource setting, introducing auxiliary Chinese NLP tasks enhances the training of sequence tagging tasks. We also evaluate the performance of Large Language Models (LLMs) on this task. While they achieve high scores on direct machine translation, our method could serve as a supplement to LLMs to improve the quality of character annotations.
2024
Annotate Chinese Aspect with UMR——a Case Study on The Little Prince
Sijia Ge | Zilong Li | Alvin Po-Chun Chen | Guanchao Wang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Aspect is a valuable tool for determining the perspective from which an event is observed, allowing events to be viewed at both the situation and viewpoint levels. Uniform Meaning Representation (UMR) seeks to provide a standard, typologically-informed representation of aspect across languages. It employs an aspectual lattice to adapt to different languages and designs values that encompass both viewpoint aspect and situation aspect. In annotating the Chinese version of The Little Prince, we paid particular attention to the interactions between aspect values and aspect markers, and we also examined annotation effectiveness and challenges under the UMR aspectual lattice. During our annotation process, we identified the relationships between aspectual markers and labels. We further categorized and analyzed complex examples that led to low inter-annotator agreement. The factors contributing to disagreement among annotators included the interpretations of lexical semantics, implications, and the influence of aspectual markers, which is related to the inclination of the situation aspect and the exclusivity between the two aspects’ perspectives. Overall, our work sheds light on the challenges of aspect annotation in Chinese and highlights the need for more comprehensive guidelines.