Character-level Dependency Annotation of Chinese

Li Yixuan


Abstract
We propose a new model for annotating dependency relations at the level of Mandarin characters. The aim is to build treebanks that address the unsatisfactory performance of existing word segmentation and syntactic analysis models in specific scientific domains, such as Chinese patent texts. The result is a treebank of 100 sentences annotated according to our scheme. It also serves as a training corpus for the subsequent development of a joint word segmenter and dependency analyzer, which would allow downstream tasks in Chinese to be decoupled from the non-standardized pre-processing step of word segmentation.
Anthology ID: 2023.depling-1.5
Volume: Proceedings of the Seventh International Conference on Dependency Linguistics (Depling, GURT/SyntaxFest 2023)
Month: March
Year: 2023
Address: Washington, D.C.
Editors: Owen Rambow, François Lareau
Venues: DepLing | SyntaxFest
SIG: SIGPARSE
Publisher: Association for Computational Linguistics
Pages: 42–53
URL: https://aclanthology.org/2023.depling-1.5
Cite (ACL): Li Yixuan. 2023. Character-level Dependency Annotation of Chinese. In Proceedings of the Seventh International Conference on Dependency Linguistics (Depling, GURT/SyntaxFest 2023), pages 42–53, Washington, D.C. Association for Computational Linguistics.
Cite (Informal): Character-level Dependency Annotation of Chinese (Yixuan, DepLing-SyntaxFest 2023)
PDF: https://aclanthology.org/2023.depling-1.5.pdf
Video: https://aclanthology.org/2023.depling-1.5.mp4